The Quantification of the Frame of Reference in Labor-Management Communication ¹


Carl H. Weaver ²


Ohio State University 


One of the barriers to certain kinds of com- 
munication between management and labor is 
the effect which the frame of reference has 
upon the concept evoked in members of one 
of these groups by a symbol used by a mem- 
ber of the other group. This research was an 
attempt to quantify the barrier posed by dif- 
ferences between the frames of reference of 
these two groups in the area of industrial 
relations by means of the semantic differential 
technique. 


Problem 


The problem and the theory of the research
were described in a preliminary report (14). 
Briefly, the problem concerned that type of 
communication in which management at- 
tempts to persuade labor to accept manage- 
ment’s point of view and thus effect a change 
in the behavior and attitude of the labor 
group. Accurate measurement of the frames 
of reference of these two groups might aid in 
appraising the common tendency of manage- 
ment to resort to another medium of com- 
munication when current methods fail. 

In terms of concepts and meaning, the frame of reference (the
“apperceptive mass” of educators) appears to be about what
semanticists mean when they speak of a listener’s previous
experience. A symbol may by explicit or written agreement stand
for anything which may be agreed upon. In some scientific
disciplines this agreement may be fairly precise. This is not the
case, however, with most of the symbols used in communication.
Concepts developed in the way demonstrated by Fisher (4) and
Hull (6) are personal and individual. They are built through the
process of generalizing abstractions from individual experiences
with the object or process conceived. When a communicator uses a
symbol to convey to a communicatee a meaning which he has in his
own mind, he can only evoke in the mind of his listener the
concept which has been developed there through the listener's own
past experiences with objects and processes which he has
considered, consciously or not, to be related to that symbol.

¹ Based on a dissertation directed by Franklin H. Knower.
² Now at Central Michigan College.

The semantic distance between the concept 
evoked in the communicatee and the concept 
intended by the communicator is a semantic 
barrier to communication. The concepts ac- 
cumulated by a person give him a frame of 
reference through which he observes and 
evaluates the objects and processes of the 
external world. The frame of reference in- 
fluences the concepts in their formation and 
change. One of the determinants of the for- 
mation of concepts is the group norm; group 
members who have internalized well the 
norms of a reference group are likely to hold 
similar concepts and similar frames of refer- 
ence toward objects and processes related to 
the norms of that group. In the research re- 
ported here, the frame of reference was es- 
tablished by measuring the connotative mean- 
ing of selected symbols in the area of labor- 
management relations. It was believed that 
the labor and management groups had social 
norms in this area which were different from 
each other; thus, members of the two groups 
should respond to the test in different ways. 





Method 


The semantic differential, developed by Osgood 
and others, was used to measure the meaning of 
concepts selected from the area of industrial rela- 
tions. The theory and technique of the semantic 
differential have been adequately described elsewhere 
(7, 8, 9, 12). 

The meaning of a symbol (called a “concept”) is 
measured by asking S to mark on a seven-point 
scale between two logically or psychologically op- 
posing terms the point at which he perceives the 
meaning to lie. The two extremes are called the 
“gradient.” This is an example: 


SENIORITY    hot  1  2  3  4  5  6  7  cold


In the research reported here, the S drew a circle 
around the number which he believed best repre- 
sented the meaning of the concept (seniority) on the 
gradient paired with it (hot-cold). 

The concepts for this study were chosen after in- 
terviews with state and regional labor leaders and 
management executives, and after surveying the 
writings of such authors as Bakke (1), Reynolds 
(11), Chamberlain (2), Peters (10), Heron (5), 
Walker (13), and others. Since the purpose of this 
research was to measure a barrier to communication, 
only those areas in which the positions of the labor 
group and the management group could be expected
to diverge were listed. These can be seen in Table 1. 
It was believed that the 21 areas listed there in- 
cluded most of the important diverging social norms 
of the two groups, and that they could be subsumed 
under two broad categories: reducing the discretion 
of management and the struggle for the loyalty of 
the worker. Ten symbols which might be expected to
evoke concepts related to one or more of these areas
were selected; they are listed in the right-hand
column of the table.

The gradients which were matched against these 
concepts were taken from a factor analysis reported 
by Osgood (8). They are listed in Table 2. Since 
this research was evaluative in character, only those 
gradients which had high loadings on the evaluative 
factor were selected. This list would not have been 
greatly changed had it been selected from the second 
factor analysis done by Osgood and Suci (9). The 
figures in parentheses after the gradients are the fac- 
tor loadings. The last gradient in the list was not 
used in the factor analysis but was used by Osgood 
in other studies. 

Since each concept was paired with each gradient, 
the pilot test consisted of 300 items, arranged in 
random order. The sheets were rotated and stapled 
together with a cover page of instructions and a 
final information sheet. This pilot test was administered
to 25 Ohio State University students who strongly favored
and 25 who strongly opposed labor unions. The average age
of the pro-labor group was 24.33 years and of the anti-labor
group, 22.24 years. Twenty-four of the pro-labor group and 21
of the anti-labor group were males. At the time of the test
three of the pro-labor group were union members but none of
the anti-labor group. The pro-labor group recorded a total
previous union membership of 35 years and two months. The
anti-labor group recorded a total previous membership of six
years and six months.


Table 1

Derivation of the Concepts

Areas of Opposing Norms

A. Reducing the discretion of management
    1. Wages and profits
    2. Transfers
    3. Promotions
    4. Lay-offs
    5. Hours
    6. Settlement of grievances
    7. Hiring (closed or union shop, former employees)
    8. Discipline
    9. Discharge
   10. Arbitration
   11. Rate of production (pace, speed-up)
   12. Equal pay (elimination of competition)
   13. Job classification

B. Struggle for the loyalty of the worker
    1. Enforced union membership
    2. Independent vs. international union
    3. Industry-wide bargaining
    4. Support of other unions
    5. Working during a strike
    6. Attending meetings
    7. Collective vs. individual bargaining
    8. Voting labor

Concepts (right-hand column)

   Seniority
   Grievance
   Arbitration
   Work quota
   Equal pay (for equal work)
   The closed shop
   The labor movement
   Working during a strike
   Individual bargaining
   Labor in politics


Table 2

Evaluative Gradients Taken from Osgood and Suci’s Factor Analysis

    1. good-bad (.88)               16. ferocious-peaceful (.69)
    2. beautiful-ugly (.86)         17. bright-dark (.69)
    3. sweet-sour (.83)             18. healthy-sick (.69)
    4. clean-dirty (.82)            19. fresh-stale (.68)
    5. kind-cruel (.82)             20. brave-cowardly (.66)
    6. pleasant-unpleasant (.82)    21. black-white (.64)
    7. bitter-sweet (.80)           22. calm-agitated (.61)
    8. sacred-profane (.81)         23. rich-poor (.60)
    9. nice-awful (.87)             24. clear-hazy (.59)
   10. fragrant-foul (.84)          25. high-low (.59)
   11. honest-dishonest (.85)       26. empty-full (.57)
   12. fair-unfair (.83)            27. relaxed-tense (.55)
   13. tasty-distasteful (.77)      28. rough-smooth (.46)
   14. valuable-worthless (.79)     29. near-far (.41)
   15. happy-sad (.76)              30. up-down

Note. Figures in parentheses after the gradients are the loadings
on the evaluative factor in Osgood's factor analysis.


A critical ratio of proportion technique was used to select the
differentiating items. Of the 300 items, 107 differentiated
between the groups at the 5% level of confidence. No gradient
differentiated when paired with either of the concepts equal pay
and work quota. The number of neutral scores given these concepts
and the lack of consistency suggested that they were measuring
concepts other than the ones intended. The shortening of equal
pay from equal pay for equal work may have changed the concept.
These two concepts were dropped from the test.


The Pretest


The 12 gradients which differentiated between the two criterion
groups with each of the remaining eight concepts at the highest
levels of confidence were retained and combined into a new test.
In addition, the gradient good-bad was paired with four concepts
at the end of the test to make a test of 100 items. These items
were combined randomly, except that no gradient was allowed to
follow itself directly and a concept was separated from itself by
three other concepts. Since six of the concepts were stated from
the labor point of view and two from the management point of
view, the direction of the continuum was not regular. These items
were reversed before tallying the responses.
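
The reversal described above can be illustrated with a short sketch. The article does not give the arithmetic; the sketch assumes that a response x on the seven-point scale is reflected about the neutral point, so that x becomes 8 - x. The function names and the sample answers are hypothetical.

```python
SCALE_MIN, SCALE_MAX = 1, 7  # seven-point scale used in the test


def reverse_response(x: int) -> int:
    """Reflect a response about the neutral point (assumed rule: 8 - x)."""
    if not SCALE_MIN <= x <= SCALE_MAX:
        raise ValueError(f"response {x} is outside the 1-7 scale")
    return SCALE_MIN + SCALE_MAX - x


def align_responses(responses, reversed_items):
    """Return the responses with the items in `reversed_items` reflected.

    `responses` maps item number -> circled value; `reversed_items` is the
    set of item numbers whose continuum runs in the opposite direction.
    """
    return {
        item: reverse_response(value) if item in reversed_items else value
        for item, value in responses.items()
    }


# Hypothetical answers of one S to three items; item 2 is stated in reverse.
sample = {1: 2, 2: 6, 3: 4}
aligned = align_responses(sample, reversed_items={2})
```

Under this assumed rule a neutral response (4) is unchanged, and the two ends of the scale exchange places.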

The pretest was administered to two labor and 
two management groups which were considered to 
be criterion groups. The labor groups were 20 local- 
union officers assembled at The Ohio State Univer- 
sity at a labor institute, and 48 members of the 
state council of an international union. All of these 
Ss were elected officers. It was hypothesized that 
one of the reasons they had been elected to their 
offices was that they had internalized well the norms 
of their groups. The management groups were 33 
members of a local unit of an international service 
club, all of whom had expressed prejudice toward 
the management point of view, and 38 members of 
an industrial association in a large midwestern city. 

The test was administered to the first three of these groups in
group situations. The members of the industrial association
received the test by mail from their executive secretary. The
profiles of the four groups are shown in Fig. 1.

[Figure: mean scale position (1 to 7) of each group on each
concept: The Closed Shop, Grievance, Arbitration, The Labor
Movement, Working During a Strike, Labor in Politics, Seniority,
Individual Bargaining. Key: A, State Labor Council; B, Local-Union
Officers; C, Service Club; D, Industrial Association.]

Fig. 1. Profiles of two labor groups and two management groups on
the 100-item pretest.


Reliability of the Pretest 


Split-half reliability coefficients, corrected for length 
by the Spearman-Brown formula, were computed for 
labor (r= .96) and for management (r= .96). In 
addition, the product-moment correlation coefficient 
of the two labor groups (r= .93) was computed, 
and that of the two management groups (r= .85). 
This was the equivalent-group method of determining
reliability used and reported by Stagner and
Osgood (12). The standard error of measurement
for labor was .12 and for management .19 scale unit. 
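
A minimal sketch of the two computations named above may be helpful. It assumes the usual textbook forms: the Spearman-Brown correction r' = 2r / (1 + r) for a test twice the length of each half, and a standard error of measurement equal to the standard deviation of the scores times the square root of (1 - reliability). The article does not print these formulas, and the numbers used in the example are illustrative only, not figures from the study.

```python
import math
from statistics import mean


def pearson_r(xs, ys):
    """Product-moment correlation between two equally long score lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def spearman_brown(r_half, factor=2.0):
    """Reliability of a test `factor` times as long as the part correlated."""
    return factor * r_half / (1.0 + (factor - 1.0) * r_half)


def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - r), in the same units as the scores."""
    return sd * math.sqrt(1.0 - reliability)


# Illustrative values only: a half-test correlation of .92 and a score
# standard deviation of .5 scale unit with a reliability of .84.
corrected = spearman_brown(0.92)
sem = standard_error_of_measurement(0.5, 0.84)
```

In a split-half design `pearson_r` would be applied to the scores on the two halves of the test, and its result passed to `spearman_brown`.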

The consistency of operation of the test may be 
observed by inspecting the statistics in Table 3.
These are the differences between the mean responses 
of the labor and management groups for each item. 
The items are grouped under the heading of the 
concepts with which they are paired. The consist- 
ency with which the differences for each concept 
hover within a narrow range may be seen on this 
table. For example, except for an equivocal item 
(No. 42) the differences between labor and manage- 
ment on all gradients paired with the concept the 
closed shop range from 3.2 scale units to 3.9 scale 
units, a range of only .7 scale unit. Two other 
equivocal items may be seen on Table 3: one 
matched with labor in politics and one matched 
with working during a strike. The first two of 
these were marked almost randomly and showed no 
significant difference between labor and management. 


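Table 3


Differences Between Labor and Management Mean Responses on Individual Items in
Scale Units on the 100-Item Pretest

[Only part of this table is legible. Item numbers and differences
are given as far as they can be read; a dash marks an unreadable
entry.]

The Closed Shop
   Items:        58, 64, 74, 77, 82, 95, 99 (others unreadable)
   Differences:  3.9, 3.5, 3.5, -, 3.8, 3.8 (others unreadable)

Grievance
   Items:        56, 59, 73, 83, 88, 91, 98 (others unreadable)
   Differences:  unreadable

Arbitration
   Items:        63, 71, 79, 84, 87, 90 (others unreadable)
   Differences:  unreadable

Working During a Strike
   Item  Difference     Item  Difference
   12.   3.7            41.   3.6
   19.   3.3            47.   3.1
   23.   3.0            57.   2.4
   29.   1.7            66.   3.2
   32.   -              69.   3.7
   35.   4.0            94.   3.1

Labor in Politics
   Item  Difference     Item  Difference
   -     -              61.   -
   -     -              75.   -
   -     -              78.   -
   26.   3.2            85.   -
   36.   3.4            89.   -
   46.   3.1            92.   -

Seniority
   Items:        20, 25, 30, 33, 38, 44 (others unreadable)
   Differences:  1.5, 1.3, 1.5, 1.6, 1.5, 1.4 (others unreadable)

The Labor Movement
   Items:        21, 24 (others unreadable)
   Differences:  unreadable

Individual Bargaining
   Items:        43 (others unreadable)
   Differences:  unreadable
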
Validity of the Pretest


Osgood has discussed the validity of the semantic 
differential at some length (8). Perhaps the evi- 
dence for validity in this research lies in the sta- 
tistics which underlie Fig. 1. The position of each 
group for each concept on this figure is the mean of 
the mean responses for all items matched with that 
concept. It may be noted that on every concept la- 
bor marked toward one end of the scale and man- 
agement marked away from labor toward the other 
end. The two groups maintained this direction even 
on the concepts seniority and arbitration, where 
management was consistently on labor’s end of the 
scale, and on grievance, where management was ap- 
proximately neutral. 

It might be expected that the members of the 
industrial association would have internalized the 
norms of management more completely than the 
members of the service club would. About 55% of 
the service club were engaged in business, but only 
two were engaged in enterprises large enough that 
the opposing norms of these two groups would be- 
come prominent in their experience. The remainder 
of the 55% were engaged in small, unorganized en- 
terprises such as jewelry stores, drug stores, a small 
children’s clothing store, etc. The remaining 45% of 
the club was composed of professional men: doctors, 
lawyers, judges, teachers, dentists, etc. On the other 
hand, the industrial association was composed of 
men who were rather closely engaged in this labor- 
management problem. About two-thirds of them 
were presidents, vice-presidents, or plant managers 
of large industries. About one-third, including some 
of these, were industrial relations managers in name 
or in fact. One would expect them to have internal- 
ized better the norms of the management group and 
to mark the items on the more extreme positions. 
It may be seen in Fig. 1 that, although most of the 
differences are not significant, the industrial associa- 
tion was more extreme on every concept, and the 
lines never cross. 

About the same judgment may be made of the 
two labor groups. One would expect the state labor 
council to have internalized better the norms of the 
labor group, since adherence to group norms con- 
tributes to the popularity and leadership qualities of 
a group member. The positions of the local-union 
officers are considerably below those of the state la- 
bor council in the union hierarchy. Inspection of 
Fig. 1 shows that the two groups marked the items 
as expected. The state labor council was more ex- 
treme on every concept, and the lines do not cross 
at any point. 


Results 


The mean of the item means for the 67 la- 
bor Ss was 2.2 and for the 71 management 
Ss, 4.6. 

The significance of the differences between 
the means of these two groups on individual 


items (Table 3) was computed by means of
the t test. Lack of homogeneity of variance
among the item distributions was demonstrated
by significant values for F. Consequently,
a formula for t which does not make
the assumption of a normal population (and
homogeneity of variance) was used to test
the significance of the differences between the
two groups (3). Ninety-seven of the 100
items gave values for t which were statistically
significant at the .1% level of confidence.
The smallest value for t among these
97 items was 4.12.
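
The formula taken from reference (3) is not reproduced in the article. As an illustration only, the sketch below uses one widely known statistic that drops the assumption of equal variances: the difference between means divided by the square root of the sum of the two squared standard errors, with the Welch-Satterthwaite approximation for the degrees of freedom. It addresses only the unequal-variance part of the problem, and it should not be read as the author's computation; the sample data are hypothetical.

```python
import math
from statistics import mean, variance


def unequal_variance_t(a, b):
    """Return (t, df) for two independent samples with unequal variances.

    t  = (mean_a - mean_b) / sqrt(s_a^2 / n_a + s_b^2 / n_b)
    df = Welch-Satterthwaite approximation (an assumed choice here).
    """
    n1, n2 = len(a), len(b)
    v1 = variance(a) / n1  # squared standard error of the first mean
    v2 = variance(b) / n2  # squared standard error of the second mean
    t = (mean(a) - mean(b)) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Hypothetical responses of two small groups to a single item.
group_a = [1, 2, 3, 4, 5]
group_b = [2, 4, 6, 8, 10]
t_value, deg_freedom = unequal_variance_t(group_a, group_b)
```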

Inspection of Fig. 1 will show the relatively 
greater polarization of the labor group as op- 
posed to that of the management group. The 
approximately neutral position of manage- 
ment (4.6) seemed to have been caused by 
the strong trend toward the labor end of the 
continuum on seniority, by the milder trend 
in the same direction on arbitration, and by 
the neutral position on grievance. On the 
other five concepts, management assumed a 
position on the continuum opposite the end 
which labor chose, but used the extreme po- 
sition on the scale less often than the mem- 
bers of the labor group did. 

The items were evaluated in terms of Osgood’s
D value. As used by Osgood (7), the
D value was the scale distance between the
neutral point on the scale and the mean
response of the group on an item. Osgood
considered an item to have a satisfactory D value
only if the distance were at least 1.5 scale units;
in the present comparison of two opposing
groups, the D value would have to be at least 1.5
in both directions. Computation of the D
values showed that a 20-item test could be
constructed according to Osgood’s standard,
but would include only three concepts and
one of them would be measured by only one
gradient. Reducing the D value from 1.5 to
1.0 would make it possible to include more
concepts but the number of gradients would
be inadequate on all but one or two of them.
This apparent incongruity between the t
values and the D values was caused by
management’s tendency to mark on labor’s end
of the scale or within 1.5 scale units of the
neutral point. Labor polarized 1.5 scale
units or more on 85 of the 100 items, but
this was true of management on only 22
items.
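
The D value criterion can be sketched as follows. The neutral point of 4 on the seven-point scale and the 1.5 threshold follow the description above; the function names are hypothetical, and the requirement that the two groups lie on opposite sides of neutral is one reading of "in both directions."

```python
NEUTRAL = 4.0  # midpoint of the seven-point scale


def d_value(mean_response):
    """Scale distance between a group's mean response and the neutral point."""
    return abs(mean_response - NEUTRAL)


def polarized_both_ways(labor_mean, management_mean, threshold=1.5):
    """True if the groups sit on opposite sides of neutral, each by at
    least `threshold` scale units (assumed reading of the criterion)."""
    labor_offset = labor_mean - NEUTRAL
    management_offset = management_mean - NEUTRAL
    opposite_sides = labor_offset * management_offset < 0
    return (
        opposite_sides
        and abs(labor_offset) >= threshold
        and abs(management_offset) >= threshold
    )


# Over-all means reported in the Results: 2.2 for labor, 4.6 for management.
labor_d = d_value(2.2)
management_d = d_value(4.6)
meets_standard = polarized_both_ways(2.2, 4.6)
```

Applied to the over-all means, labor lies about 1.8 scale units from neutral and management about .6, so the pair would not meet the standard even though the difference between the groups is large.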

An accompaniment of management’s tend- 
ency to polarize less well than labor was a 
tendency for the members of the management 
group to agree less well with each other. The 
mean standard deviation on the individual 
items was 1.525 for the labor group and 1.612 
for the management group, a difference of 
.087 scale unit, significant at the 5% level of 
confidence (t = 2.06).


The 40-item Test 


Since almost all of the items produced 
highly significant differences between labor 
and management mean responses, the test 
was shortened by selecting for each concept 
the five gradients which produced the great- 
est scale differences. The means of the mean 
responses on these items were computed for 
each concept. Fig. 2 compares the profiles 
of the two groups on this 40-item test and on 
the 100-item pretest. 
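
The selection rule for the shortened test (for each concept, the five gradients with the greatest scale differences) may be sketched as below. The data layout, the function name, and the sample differences are hypothetical; ties, which the article does not discuss, are left in their original order.

```python
from collections import defaultdict


def select_items(differences, per_concept=5):
    """Pick, for each concept, the gradients with the largest differences.

    `differences` is an iterable of (concept, gradient, scale_difference)
    tuples; the result maps each concept to a list of gradients, largest
    difference first.
    """
    by_concept = defaultdict(list)
    for concept, gradient, diff in differences:
        by_concept[concept].append((diff, gradient))

    selected = {}
    for concept, pairs in by_concept.items():
        pairs.sort(key=lambda pair: pair[0], reverse=True)
        selected[concept] = [gradient for _, gradient in pairs[:per_concept]]
    return selected


# Hypothetical differences for one concept, keeping the top two gradients.
sample = [
    ("seniority", "good-bad", 1.6),
    ("seniority", "fair-unfair", 1.5),
    ("seniority", "kind-cruel", 1.3),
]
chosen = select_items(sample, per_concept=2)
```

With eight concepts and five gradients for each, the rule yields the 40 items of the shortened test.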

Split-half reliabilities and standard errors 
of measurement were computed for this 40- 
item test as on the pretest. They are listed 
in Table 4. Thus, although the reliability 
coefficients for this test were not low, they 
were considerably lower than those secured 
on the pretest. The standard error of meas- 
urement was satisfactory for the labor groups 
but was much higher for the management groups than on the
pretest. It became apparent in later computations that most of
this effect was caused by responses on the three concepts
grievance, arbitration, and seniority, on which management was
either neutral or marked in the direction of labor.

[Figure: profiles of labor and management on each concept: The
Closed Shop, Grievance, Arbitration, The Labor Movement, Working
During a Strike, Labor in Politics, Seniority, Individual
Bargaining. Key: Labor: A, 40-item test; B, 100-item pretest.
Management: C, 40-item test; D, 100-item pretest.]

Fig. 2. Profiles of labor and management on the 100-item pretest
compared with their profiles on the 40-item test.


Table 4

Reliability Coefficients and Standard Errors of
Measurement on the 40-Item Test

                                             Standard
                            Reliability      Error of
   Group                    Coefficient      Measurement

   Local-union officers        .87              .18
   State labor council         .89              .13
   Total labor                 .89              .13

   Service club             [illegible]         .54
   Industrial association      .82              .42
   Total management            .80              .40



The 25-item Test 


Although results were not significantly different
when the length of the test was reduced to 40 items,
the reliability was lowered and the error increased.
The test was therefore changed by dropping the three
concepts on which management did not polarize,
leaving a test of 25 items. This test consisted of the
five remaining concepts, each paired with the five
gradients which showed the greatest scale differences
between labor and management.


Table 5

Reliability Coefficients and Standard Errors of
Measurement on the 25-Item Test

                                             Standard
                            Reliability      Error of
   Group                    Coefficient      Measurement

   Local-union officers        .89              .16
   State labor council         .96              .10
   Total labor                 .95              .11

   Service club                .91              .10
   Industrial association      .92              .16
   Total management            .92              .12


Table 6

Mean Standard Deviations of the Item Distributions

                                   Mean Deviation
   Item Group                    Labor    Management   Difference   Significance

   100-item pretest              1.525      1.612         .087         5%
   40-item test                  1.503      1.566         .063         None
   25-item test                  1.604      1.542         .062         None
   15 items (seniority,       [illegible]   1.603         .270         .1%
     grievance, arbitration)



The reliability coefficients and standard 
errors of measurement were computed for this 
test as for the others. They may be seen in 
Table 5. Comparison of these statistics with 
those in Table 4 shows that when the concepts
grievance, arbitration, and seniority
were dropped from the test, the great difference
between management and labor in the
standard error of measurement disappeared. 
The error became .11 scale interval for labor 
and .12 for management. In terms of confi- 
dence limits this error was about one-third of 
a scale interval at the 1% level of confidence. 
The improvement in reliability was noticeable 
also, although the test was reduced in length. 
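
The statement about confidence limits can be checked with a brief computation. The article does not say how the limits were obtained; the sketch assumes ordinary two-sided normal limits, that is, the standard error of measurement multiplied by the normal deviate for the chosen level (about 2.58 at the 1% level). The function name is hypothetical.

```python
from statistics import NormalDist


def confidence_half_width(sem, level):
    """Half-width of a two-sided normal interval at significance `level`."""
    z = NormalDist().inv_cdf(1.0 - level / 2.0)  # about 2.576 for level = .01
    return z * sem


# Standard errors of measurement reported for the 25-item test.
half_labor = confidence_half_width(0.11, 0.01)
half_management = confidence_half_width(0.12, 0.01)
```

Under this assumption the limits come to roughly .28 scale interval for labor and .31 for management, which agrees with "about one-third of a scale interval."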

The advantage of the 25-item test over the 
40-item test seemed thus to lie in its higher 
statistical reliability and lower standard error. 
On the other hand, these advantages could 
have been the result of some factor other than 
the test itself, e.g., a changing concept, which 
Stagner and Osgood believed to have a greater 
average deviation than a fixed stereotype 
(12). This hypothesis was given some cre- 
dence when the mean standard deviations 
were computed for both of these tests. These 
statistics are given in Table 6, along with the 
same statistics for the 100-item pretest and 
for the 15 items paired with the three con- 
cepts on which management did not assume 
its expected position. It is apparent from 
Table 6 that many of the items on which 
management spread its responses more widely 
over the scale, as indicated by significance of 
the difference between labor and management 
standard deviations on the pretest, were con- 
centrated in the 15 items which were paired 
with these three concepts. Management here 
showed standard deviations which differed 


from labor’s at the .1% level of confidence. 
In the light of Stagner and Osgood’s findings, 
it was considered possible that the stereotypes 
tapped by these symbols were changing in the 
management group at the time the test was 
administered. If so, the apparent unreliabil- 
ity of the measuring instrument in these sub- 
areas may have been the unreliability of the 
concept being measured. 

Consequently, the 40-item test was ar- 
ranged with the expectation that the 25-item 
test could be scored out of it for further com- 
parisons. This test will be validated upon 
several subgroups in industry (e.g., line fore- 
men and clerical workers) and reported at a 
later time. 

Discussion 

Bakke’s conclusions after his interviews 
with 60 labor leaders and 60 business execu- 
tives were not completely supported by this 
study; nor were Chamberlain’s and Reynolds’ 
descriptions of management's antipathy to- 
ward all of the devices which labor has in- 
vented to restrict the discretion of manage- 
ment in running the business. From the re- 
ports of these and other writers, and from the 
author’s interviews with management, one 
would conclude that management holds so- 
cial stereotypes as extreme as those held by 
labor. 

It seems possible that writers in this field 
have painted a picture of the theoretical po- 
sition of the “good” union member who has 
internalized perfectly the presumed norms of 
his group, and of the “good” member of the 
management group. The results of this study 
suggest that the picture is further from the 
truth in the case of management than in the 
case of labor. The member of the management
group is not nearly so extreme as he is
generally believed to be nor as he believes 
himself to be. Since the Ss in this study 
were selected because they were believed to 
be criterion groups, this does not seem to be 
an overstatement. 

These results suggest that the semantic 
barrier to communication is greater in the la- 
bor group than in management. Although 
the semantic differential as used here does not 
measure intensity of attitude, the more ex- 
treme positions of labor on the scale and the 
greater agreement among members of the la- 
bor group indicate a more restrictive opera- 
tion of the frame of reference in labor than 
in management. As a semantic problem in 
the kind of communication in which manage- 
ment tries to tell its side of the story to la- 
bor, this is important. 

Perhaps another aspect of the validity of 
the semantic differential should be consid- 
ered here. The Ss in this study were reacting 
to a symbol when they circled a number on 
the gradient-scale. The conclusions of the 
study were based on the inference that a con- 
cept was being measured. It is possible that 
the measurement was linguistic, not concep- 
tual, and that other symbols would have 
evoked other responses and other semantic 
distances. Thus, the conclusion drawn above 
that labor provided a greater share of the 
semantic barrier than management may apply 
only to this situation, with these Ss, and with 
these symbols. 


Conclusions 


The following conclusions were drawn: 

1. Management’s frame of reference was 
significantly different from labor’s. Appar- 
ently, management has a story to tell which 
is significantly different from labor’s story. 

2. Management’s frame of reference was 
not well understood by writers on the sub- 
ject nor by management Ss used in the study. 
The management group revealed meanings 
for some concepts which were more nearly 
like labor’s than its own members seemed to 
believe. 

3. There were semantic barriers between 
the labor and management groups used in 


this study. The concepts evoked by these 
symbols in the labor listener are apparently 
not always the ones intended by the manage- 
ment communicator. 

4. The labor group stereotyped more than 
the management group and the stereotypes 
were more extreme. Members of the man- 
agement group agreed with each other less 
well and held less extreme frames of refer- 
ence than members of the labor group. Thus, 
the semantic distance seemed to have resulted 
more from labor’s frame of reference than 
from that of management. 

5. Management seemed to be leaving its
traditional position on some of these con- 
cepts and moving in the direction of labor’s 
position. 

6. The frames of reference of these two 
groups can be measured with the semantic 
differential and the strength of the semantic 
barrier quantified. 


Summary


A semantic barrier to communication be- 
tween labor and management was quantified 
by establishing the frames of reference of la- 
bor and management criterion groups on the 
semantic differential, using concepts selected 
from the area of labor-management relations. 
Significant semantic distance between the two 
groups was revealed. Labor stereotyped more 
than management, and assumed more extreme 
scale positions. Thus, the semantic distance 
seemed to have resulted more from labor’s po- 
sition than from management’s. The greater 
standard deviations of the responses of the 
management group on three concepts sug- 
gested that management’s position on these 
concepts was changing. 
Controlled Association Scores and Engineering Success

The data to be presented here were acquired
during the course of research on engineering
graduate placement tests conducted
by Educational Testing Service.¹ A controlled
association test was included in a
large battery of tests taken by experienced 
engineers. The test asked the Ss to write as 
many synonyms as possible in a limited time 
for each of eight common words. The ex- 


aminees were told that they were taking a 
test of their ability to think of words that are 
related to a key word. Twelve minutes were 
allowed to complete the test. 


The Scores 


Six scores were developed from this test. 
One score was a straightforward count of the 
total number of words or phrases given as 
responses. In terms of the previous factor 
analyses of scores of this type (1, 2, 3, 4, 5, 
8, 9), it was presumed that this score would 
reflect a composite of Originality and Associa- 
tional Fluency and would be related to super- 
visors’ ratings of success on the job. 

In an attempt to separate the Originality 
and Associational Fluency aspects of the 
score, it was decided to score the test for 
common and uncommon responses (10). By 
pretesting engineering students,² data were
obtained which made it possible to tabulate 
the frequency with which various words were 
given as responses to each of the stimuli. 
Since any definition of “common” would be 
arbitrary, a number of such definitions were 
explored. Each of them is based on the same 
procedure, that of defining “common” words 
as the smallest set of words which would account
for a specified percentage of all responses
given to each stimulus word by the
pretest groups. Four sets of common words
were examined, based on percentages of 20,
35, 50, and 70. These scores are presumed
to emphasize Associational Fluency at the
expense of Originality.

¹ This research, supervised by David R. Saunders, was supported
financially by and was based on the cooperation of five
companies: American Telephone and Telegraph Company, Detroit
Edison Company, B. F. Goodrich Company, International Business
Machines Corporation, and Westinghouse Electric Corporation.

² The cooperation of Princeton University and the Westinghouse
Education Center in the pretesting is greatly appreciated.

The sixth score was a count of every word 
or phrase given as a response but not included 
in the “common” response key based on 50%. 
This score was considered to represent “un- 
common” responses, the evidence provided by 
Guilford and his collaborators suggesting to 
the writer that the ability to call to mind un- 
common or farfetched, but synonymous, words 
is an aspect of originality, similar to the un- 
commonness-of-response score on Guilford’s 
test of Number Associations (10, pp. 365, 
368). 
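
The construction of a "common" response key and the resulting counts can be sketched as follows. The sketch assumes that the smallest set is found by taking pretest words in order of decreasing frequency until they account for at least the specified percentage of all pretest responses; the article does not say how ties in frequency were handled. The function names and the sample words are hypothetical.

```python
from collections import Counter


def common_word_key(pretest_responses, percentage):
    """Smallest set of words covering at least `percentage` of responses.

    `pretest_responses` is the list of all responses given to one stimulus
    word by the pretest groups (repeats included).
    """
    counts = Counter(pretest_responses)
    total = sum(counts.values())
    key, covered = set(), 0
    for word, n in counts.most_common():
        if covered * 100 >= percentage * total:
            break  # the required share has already been reached
        key.add(word)
        covered += n
    return key


def score_responses(responses, key):
    """Count total, common, and uncommon responses of one examinee."""
    common = sum(1 for word in responses if word in key)
    return {
        "total": len(responses),
        "common": common,
        "uncommon": len(responses) - common,
    }


# Hypothetical pretest responses to one stimulus word.
pretest = ["big"] * 5 + ["large"] * 3 + ["huge"] + ["vast"]
key_50 = common_word_key(pretest, 50)
key_70 = common_word_key(pretest, 70)
scores = score_responses(["big", "vast", "enormous"], key_50)
```

Here the 50% key contains only the most frequent word, while the 70% key needs the two most frequent words; a response absent from the 50% key is counted as uncommon.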


The Subjects 


The test battery was taken by 687 em- 
ployed, experienced engineers in five com- 
panies. By a method described elsewhere 
(6), the Ss were grouped according to the 
type of engineering work (the function) they 
were performing. There were six of these 
functional groups of engineers: I. Research, 
II. Development, III. Application, IV. Operations,
V. Supervision, and VI. Sales.

Supervisors’ ratings of job success were ob- 
tained. Some were abstracted from records 
maintained by the companies; others were ob- 
tained specifically for this study. All ratings 
were converted to rank orders within work 
groups of engineers in each company, and 
these were converted to percentile ranks in 
order to account for different group sizes. 
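The conversion of ratings to rank orders within a work group, and then to percentile ranks, might look like the sketch below. The report does not give the formula used; the mid-rank treatment of ties and the (rank - 0.5) / n convention are assumptions made here for illustration.

```python
def percentile_ranks(ratings):
    """Percentile ranks within one work group (higher rating, higher rank).

    Assumed convention: tied ratings share their average rank, and a rank r
    in a group of n becomes 100 * (r - 0.5) / n.
    """
    n = len(ratings)
    order = sorted(range(n), key=lambda i: ratings[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over a run of tied ratings.
        while j + 1 < n and ratings[order[j + 1]] == ratings[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # 1-based average rank of the run
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return [100.0 * (r - 0.5) / n for r in ranks]
```

Because the result is scaled by the group size, a top rating in a group of two and a top rating in a group of twenty are placed on the same 0 to 100 scale, which is the purpose the text gives for the conversion.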

The 687 Ss were divided into analysis and 
cross-validation subgroups of 400 and 287 
cases, respectively. Group A (the analysis 
group) was composed of 50 cases rated high 
on job success and 25 rated low from each of 
the functional groups except III, Applications. 
For that group, 25 cases covering the entire
range of job success ratings were used, because
the number of Ss in this group was
rather small. The remaining cases were assigned
to the cross-validation group (Group
B), the cases being distributed as shown in
Table 1.

[Table 1. Distribution of 287 Cases in Group B, by ratings of supervisors (High, Mixed, Low, Total) and functional group (Research, Development, and following). Legible entries: High, 51; Low, 25.]


Analysis of the Scores 


In analyzing the data from Group A, it 
was assumed for the purpose of analysis of 
variance that company differences and inter- 
actions involving companies were negligible. 
To obtain a preliminary indication of whether 
the Controlled Association Test scores were
associated with either the job-placement or 
the success criterion (or their interaction), 
over-all F ratios were computed treating the 
11 subgroups of Ss in Group A as a one-way 
classification. For only one of the scores, the 
score based on uncommon responses, was the 
F ratio not significant at the .05 level. No 
further use was made of this score in this 
study. It appears that Originality, if that is 
what this score measured, has little to do 
with either placement or success for these Ss. 

To determine the relationships involving 
the other scores treating the two criteria sepa- 
rately, two types of analysis of variance were 
used. First, for each score a one-way classifi- 
cation analysis was computed treating each 
of the six functional groups as a level. None 
of these F ratios was significant. The null 
hypothesis that the scores are unrelated to 
differences in functional groups was not re- 
jected. Second, Functional Group III was 
omitted (it being a single, heterogeneous 
group), and two-way classification analyses 
were computed, treating functional groups as 
one classification and the two levels of job
success as the other. For all five scores, the
interaction F ratios were not significant. However,
the F ratios for job success for all five
scores were statistically significant.

[Table 1, continued. Distribution of 287 Cases in Group B: columns for Application and Operations. Legible entries: 46, 23, 25, 71.]

[Table 2. Validities of Five Scores for Group A: score reliability, validity, and corrected validity for each of the five keys.]

The odd-even item reliability coefficients 
for each of these five scores for the Ss in 
Group A, the correlations between test scores 
and ratings of success, and the correlations 
corrected for unreliability of test scores ap- 
pear in Table 2. 

The correlations with success ratings, the 
test score reliabilities, and the correlations 
corrected for unreliability of the test scores, 
for Group B, the cross-validation sample, ap- 
pear in Table 3. 
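The correction for unreliability of the test scores can be illustrated with the standard attenuation formula, in which the observed validity is divided by the square root of the test reliability. The report does not print the formula, so its use here is an assumption; it does reproduce the corrected values for the first two keys of Table 3 as they are read here (.09 with reliability .34, and .08 with reliability .52).

```python
import math


def corrected_validity(validity, reliability):
    """Validity corrected for unreliability of the test score only.

    The criterion (supervisors' ratings) is left uncorrected, as in the text.
    """
    return validity / math.sqrt(reliability)


key_20 = round(corrected_validity(0.09, 0.34), 2)
key_35 = round(corrected_validity(0.08, 0.52), 2)
```

Under this formula a score with low reliability gains the most from the correction, which is why the 20% key shows the largest corrected value.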

Table 3

Validities of Five Scores for Group B

                      20%    35%    50%    70%    100%

Score reliability     .34    .52    .45    .63    .80
Validity              .09    .08    .06    .06    .03
Corrected validity    .15    .11    .09    .07    .04


On the basis of these data it appears to the
writer that a perfectly reliable score based on
a system like that of the 20% key might
have a validity of about .20 for this criterion 
with Ss like these. While this is not spec- 
tacularly high as a validity coefficient, it is 
higher than any others found so far in ETS
research on engineering graduate placement 
tests, including a large number (over 50) of 
scores from varied measures of abilities, temperament
traits, motivation, and interests.
Further indication of the possible value of a 
score of this type is derived from the finding 
that in Functional Groups I, II, III, and IV,
the common response score contributed sig- 
nificantly to the multiple correlations between 
job success and test scores (7). This indi- 
cates that the common response score, which 
was assumed to measure Associational Flu- 
ency, taps variance which is not better meas- 
ured by other tests that have been tried so 
far. 

One must not overlook a salient feature of 
the cross-validation set of data. None of the 
uncorrected validity coefficients on this set of 
data were statistically significantly different 
from zero at the 5% level. However, neither 
are any of them significantly different from 
their counterparts of Group A. Since there 
was no reason to predict shrinkage here, 
Groups A and B were compared in several 
ways to see if a cause for the lower validities 
could be found. No difference between Group 
A and B was found which would explain the 
shrinkage. The score range, mean, and stand- 
ard deviation for each key were very similar 
in both groups, as can be seen in Table 4. 
The range, mean, and standard deviation of 
the success ratings also were very similar, as 
can be seen in Table 5. 
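The two statements above, that the Group B validities do not differ significantly from zero and do not differ significantly from their Group A counterparts, can be checked with Fisher's z transformation. The report does not say which test was used, so this is only one reasonable choice; the Group A value of .15 in the second comparison is hypothetical, chosen for illustration.

```python
import math


def z_against_zero(r, n):
    """Approximate normal deviate for testing r against zero (Fisher z)."""
    return math.atanh(r) * math.sqrt(n - 3)


def z_difference(r1, n1, r2, n2):
    """Approximate normal deviate for the difference of two independent r's."""
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    return (math.atanh(r1) - math.atanh(r2)) / se


CRITICAL_5_PERCENT = 1.96  # two-tailed

# Largest Group B validity (.09) with 287 cases.
z_b = z_against_zero(0.09, 287)

# Hypothetical Group A validity of .15 (400 cases) against .09 in Group B.
z_ab = z_difference(0.15, 400, 0.09, 287)
```

With 287 cases a correlation must reach roughly .12 before it differs from zero at the 5% level, so small validities of this size cannot be separated either from zero or from somewhat larger values in another sample.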


Table 4

Range, Mean, and Standard Deviation of Scores
for Five Keys on Groups A and B

            Range             M               SD
Key        A       B       A       B       A       B

20%       1-13    1-12    6.88    6.84    1.99    1.96
35%       2-21    2-18   11.32   11.29    3.01    3.00
50%       3-28    4-27   15.96   15.71    4.28    4.03
70%       6-38    6-40   22.90   22.57    6.22    5.91
100%     12-92    9-89   41.18   40.51   12.31   12.40


Table 5

Range, Mean, and Standard Deviation of Success
Ratings for Groups A and B

            Range     M

Group A      1-8     4.31
Group B      1-8     4.40


Summary 


Six different scores from a controlled asso- 
ciation test have been studied, using a large 
sample of experienced engineers as examinees, 
and using criteria of job success and job 
placement. The test asks Ss to write, in a 
limited time, as many synonyms as they can 
to eight common words. None of the six 
scores from this test appears to be related to 
job placement, but the five of them which 
are based on the number of common responses 
given seem to be related to job success. These 
five scores vary along a continuum of com- 
monness of response words, from a very strin- 
gent definition of commonness to a definition 
which includes all responses given. Although 
most of the validity coefficients computed in 
this study are well below correlations of .20, 
it is estimated that by using a revised test 
format a validity approaching .20 could be 
obtained in a similar testing situation. Al- 
though not spectacularly high, such a validity 
coefficient for ratings of success is promising 
in comparison with over 50 other variables 
which were studied in ETS research on engi- 
neering graduate placement tests. The most 
promising score on the Controlled Association 
Test is the one based on the most stringent 
definition of commonness. A score based on 
number of uncommon responses was not significantly
related to the criteria.


In a previous paper, the writer has described
the construction of two 24-item questionnaires
for the measurement of neuroticism
and extraversion (1). The studies described
in this paper were based on item analyses of 
some 250 questions appearing in well-known 
inventories, as well as a factor analysis of 
the finally chosen 48 questions, carried out 
separately for 200 men and 200 women. The 
reliabilities of the new questionnaires were 
reasonably high, in spite of their relative 
shortness, being .88 for neuroticism and .83 
for extraversion. The independence of the 
two scales was demonstrated by the low cor- 
relation of — .09 for the original sample of 
400 men and women, and the even lower cor- 
relation of — .07 for a further male group of 
200. Factorially, too, the items chosen for 
the two scales fell into two clearly separated 
groups, making rotation to simple structure 
easy. A limited number of validation studies 
have been carried out, and are quoted in The 
Dynamics of Anxiety and Hysteria (2). 

For many practical purposes, such as work 
in market research, for instance, even a rela- 
tively short questionnaire containing 48 ques- 
tions may be too long, and the present study 
was designed to investigate the possibility of 
using an even shorter version containing only 
6 questions for each of the two scales. 


Subjects and Method 


The subjects of the investigation were approached 
on a quota sample basis by the interviewers of one 
of the largest and most experienced British Market 
Research organizations; these interviews are carried 
out all over England, correct proportions of urban 
and rural dwellers, and of the different regions of 
the country being ensured. In addition to sex, the 
sample was divided according to age, 35 being the 
dividing line. Social class was assessed in the usual 
manner, the dividing line being taken between classes 
A, B, and C on the one side, and D and E on the 
other. 

The total sample consisted of 1,600 subjects, divided
into 8 groups of 200 each on the basis of the
three selection criteria taken in all possible combinations.
The reliability of sex and age classifications
is known to be reasonably high; that for class
is rather lower (3). We may expect these unreli- 
abilities to lead to a varying degree of attenuation 
in our results. 

In the interview, a number of questions were first 
asked relating to a variety of commercial products; 
these constituted the ostensible purpose of the inter- 
view. A few personal questions about age and oc- 
cupation followed, and finally the interview was 
terminated with the 12-item personality question- 
naire given below. The questions were asked by the 
interviewer, and the answers written down by him. 
The proportion of subjects approached who refused 
outright was 7%; the proportion of subjects who 
consented to answer the questions in the first part 
of the interview and refused to answer the ques- 
tions in the personality inventory was only 2%. 

The questions used in the study are given in 
Table 1. Each question answered “Yes” was scored 
plus one point for Neuroticism (marked “N” in the 
key) or Extraversion (marked “E” in the key); 
each question answered “No” was scored minus one 
point for Neuroticism or Extraversion, respectively, 
as shown in the key. No points were given for an- 
swers which could not be clearly classified as either 
“Yes” or “No” by the interviewer. The possible 
range of scores on either factor is therefore from 
plus six points to minus six points, a total of twelve 
points. 
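The scoring rule just described can be sketched as below. The assignment of the twelve questions to the two scales is an assumption based on the content of the questions as listed in Table 1 (six mood and concentration items for N, six activity and sociability items for E); the printed key should be preferred wherever it is legible.

```python
# Assumed key for questions 1-12 in the order of Table 1.
KEY = "NENENNEENENE"


def score_questionnaire(answers):
    """Score twelve answers; each is 'Yes', 'No', or anything else.

    'Yes' adds one point to the question's scale, 'No' subtracts one,
    and an unclassifiable answer contributes nothing.
    """
    if len(answers) != len(KEY):
        raise ValueError("expected 12 answers")
    totals = {"N": 0, "E": 0}
    for scale, answer in zip(KEY, answers):
        if answer == "Yes":
            totals[scale] += 1
        elif answer == "No":
            totals[scale] -= 1
    return totals
```

Each scale therefore runs from minus six to plus six, as stated in the text.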


Results 


Tetrachoric correlations were run between 
the twelve questions, and the resulting table 
of correlations factor analyzed. Thurstone’s 
procedure was followed, and the two highly 
significant factors emerging were rotated in 
accordance with the principle of simple struc- 
ture (4). Table 2 gives the factor loadings 
of the rotated factors. Also given in Table 2 
are the loadings of the 12 items which they 
had originally had in the analyses carried out 
on the whole population of 200 men and 200 
women for all 48 items (1). The compari- 
son shows that the figures are remarkably 
similar from one occasion to the other, al- 
though methods of selection have changed 
considerably, and although in the original
analyses the 12 items were only a small part
of the total number of items factor analyzed.
In some ways the new set of factor loadings
is even more clear-cut than the original one.
None of the E items has loadings on N as
large as .10, and none of the N items has
loadings on E as large as .10; in the original
study several loadings exceeded this figure.
We may conclude, then, that the factor structure
has stood up well to repetition.


Table 1

Questions                                                                                          Key

A. Do you sometimes feel happy, sometimes depressed, without any apparent reason?                  N
B. Do you prefer action to planning for action?                                                    E
C. Do you have frequent ups and downs in mood, either with or without apparent cause?              N
D. Are you happiest when you get involved in some project that calls for rapid action?             E
E. Are you inclined to be moody?                                                                   N
F. Does your mind often wander while you are trying to concentrate?                                N
G. Do you usually take the initiative in making new friends?                                       E
H. Are you inclined to be quick and sure in your actions?                                          E
I. Are you frequently "lost in thought" even when supposed to be taking part in a conversation?    N
J. Would you rate yourself as a lively individual?                                                 E
K. Are you sometimes bubbling over with energy and sometimes very sluggish?                        N
L. Would you be very unhappy if you were prevented from making numerous social contacts?           E


[Table 2. Factor loadings of the 12 items on the rotated factors, for the Present Sample and the Original Sample.]


The correlation between Extraversion and
Neuroticism is −.05; this is very similar to
the correlations reported previously for our
samples of men and women. Again, therefore,
the figures from the present study bear
out in an important direction the conclusions
from the original work. The split-half reliabilities
(corrected) are .79 for N and .71 for
E; these values are acceptable for group comparisons.
(Test-retest reliabilities on small
groups have been found to be slightly, but
not significantly, in excess of these figures.)


Table 3

Analysis of Variance of the "Neuroticism" Scores

                                      Sum of
Source of Variance                    Squares        df     Mean Square

Total
Main effects
  Sex
  Class                               311.5225               311.5225*
  Age                                 142.8025               142.8025*
First order interactions
  Sex: Class
  Sex: Age
  Class: Age                            1.3225
Second order interactions              14.8225
Totals
  All interactions                     52.7700        4       13.1925
  All differences between groups     1502.4975        7      214.6425
Residual variance within groups     18767.2800     1592       11.7885

* Signifies statistical significance at 5% level.


Table 4

Analysis of Variance of the "Extraversion" Score

                                      Sum of
Source of Variance                    Squares        df     Mean Square

Total                               14263.8975
Main effects
  Sex
  Class
  Age
First order interactions
  Sex: Class
  Sex: Age
  Class: Age
Second order interaction
Totals
  All interactions                      9.5350        4        2.3838
  All differences between groups      134.1775        7       19.1682
Residual variance within groups     14129.7200     1592        8.8755

* Signifies statistical significance at 5% level.


Results of an analysis of variance for Neuroticism
and Extraversion scores respectively
are reported in Tables 3 and 4. Significant
differences due to some of the main effects appear
in the scores for both factors, but they
are more conspicuous on the N scores, where
they account for 7.41% of the total variance,
than on the E scores, where they only account
for 0.94%. The sex difference is the greatest
in relation to N and the only significant one
in relation to E. On N, the women have a 
score roughly ½ SD higher than the men (i.e.,
women are less stable); on E, the men have 
a score roughly ⅙ SD higher than the women
(i.e., men are more extraverted). Class and 
age differences are also significant for N, the 
lower class and younger age groups being 
slightly more unstable emotionally by ¼ SD
and ⅙ SD, respectively. None of the interactions
give rise to mean square variances significantly
greater than the residual error; on
the whole they tend to be small. In fact, 
most of the observed differences are slight 
and only significant because of the large num- 
ber of cases; little psychological importance 
would appear to attach to any of them except 
the sex difference on N, which is large and in 
line with previous work (1). 
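The arithmetic behind these statements can be retraced from the legible entries of Tables 3 and 4. The sketch below recomputes the share of total variance lying between the eight groups and expresses a main effect as a difference between two group means in units of the within-group SD. The second function assumes two equal halves of 800 cases each, for which the sum of squares equals (1600 / 4) times the squared mean difference; the function names are illustrative.

```python
import math

N_TOTAL = 1600


def proportion_between(ss_between, ss_within):
    """Share of the total sum of squares lying between groups."""
    return ss_between / (ss_between + ss_within)


def difference_in_sd(ss_effect, ms_within, n_total=N_TOTAL):
    """Mean difference between two equal halves, in within-group SD units.

    For two groups of n_total / 2 cases each, SS = (n_total / 4) * d**2.
    """
    d = math.sqrt(ss_effect / (n_total / 4))
    return d / math.sqrt(ms_within)


n_share = 100 * proportion_between(1502.4975, 18767.2800)   # Table 3
e_share = 100 * proportion_between(134.1775, 14129.7200)    # Table 4
class_effect = difference_in_sd(311.5225, 11.7885)          # N, social class
age_effect = difference_in_sd(142.8025, 11.7885)            # N, age
f_class = 311.5225 / 11.7885                                # F ratio, 1 and 1592 df
```

The two percentages agree with the 7.41% and 0.94% quoted in the text, and the class and age effects on N come to about one-quarter and one-sixth of an SD.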

The mean scores for N and E, respectively, 
are .15 and 1.96 for our sample; corrections 
for different proportions in the total populations
would not give appreciably different
estimates of population parameters, and would 
appear to be a task of supererogation. Distributions
of scores are sufficiently normal to
permit the use of correlational statistics,1 and
the variances of the different groups are sufficiently
homogeneous to permit analysis of
variance to be carried out without transformation.
The variances for N are slightly
higher than those for E, being 11.73 as compared
with 8.83.

A question regarding drinking habits was
included in the questionnaire. A division was
made between “drinkers,” i.e., those who
drank frequently or sometimes, and “nondrinkers,”
i.e., those who drank very rarely
or never. The N scores of these two groups
were very similar, being −.37 as compared
with .04; if anything, it appears that “nondrinkers”
as here defined are very slightly
more unstable than drinkers. The small size
of the difference does not warrant our taking
this conclusion too seriously. The E scores of
the two groups are very significantly different,
the scores being 2.48 and 1.55. Thus
drinkers are about ⅓ SD more extraverted
than nondrinkers.

1 The distribution of the E scores has a noticeable
negative skew, but it is doubtful if this is sufficient
to make desirable the use of logarithmic or other
types of transformation.


Summary


An investigation has been carried out to
demonstrate the possibility of constructing
short reliable personality questionnaires which
might be of use in industrial and applied
work, and which could be administered in the
usual interview situation.

An analytic sample of 1,600 adult subjects,
equally divided as to age, sex and social class,
was selected on a quota-sampling basis and
administered a 12-item questionnaire. Six
questions bearing on neuroticism and 6 questions
bearing on extraversion had been selected
from a previous item-analytic and factor-analytic
study in order to cross-validate
certain conclusions. Correlations were calculated
between the 12 items, and a factor
analysis performed; this disclosed two orthogonal
factors clearly identical with those
of the previous analysis. Analysis of variance
gave evidence of certain score differences
due to sex, age, and social class, although
with the exception of the sex differences these
were of minor importance. The 12-item questionnaire
was found to have reasonable reliability,
and the two personality variables
measured by it were found to be uncorrelated.
The practical usefulness of instruments of
this kind was discussed.


Prehension Force as a Measure of Psychomotor Skill for
Bare and Gloved Hands 1, 2


John Lyman and Hilde Groth 
University of California, Los Angeles 


Data regarding thumb-fingertip grasp forces 
and the variables affecting these forces dur- 
ing manipulation are of both theoretical and 
practical importance to the psychology of 
motor skills. For theory, they may aid in 
defining the role of sensory feedback loops in 
manual activities. For application, such data 
are of potential value to the design of termi- 
nal devices for artificial arms, to the design of 
protective hand coverings and to the design 
of perceptual-motor tasks by industrial en- 
gineers. 

Static measurements of maximum grasp 
force with the thumb opposing the index and 
middle fingers have been reported by Inman 
and Eberhardt for eight Ss as a function of 
arm-hand angle and the distance between the 
thumb and the opposing fingers (4). Their 
results, which were not tested for statistical 
reliability, suggest that distance between the 
thumb and fingers over the range from one- 
half to three inches was not an important 
variable, but that arm-hand angle was, with 
a mean maximum grasp force of approximately
17.5 pounds at an angle of 145°. No dynamic
measurements of normal finger-thumb
grasp forces are known to us. 

In view of the sparse treatment in the literature
of this aspect of manipulative
skill, it was felt that a challenging
problem in methodology existed for making 
the measurements, and that the dynamic 
measurement of prehension forces might prove 
to be a valuable index to perceptual aspects 
of motor skill which measurements of move- 
ment time and patterns could not take into 
account. Accordingly, on a psychomotor task
requiring discrete movements, observations
were made on four variables that might be
presumed to affect prehension force. These
variables were weight of the object, distance
moved, direction moved, and protective
handcovering.

1 This investigation was supported by QM Contract
No. DA 44-109-9M-1531 between the U. S.
Army Q.M. Corps and the University of California,
Los Angeles. The opinions expressed are those of
the writers and do not necessarily reflect those of
the contracting agency.

2 Some of the experimental results were presented
by one of the authors at the APA Convention in San
Francisco, September, 1955.


Method 


Apparatus. The work space, which is illustrated in
Fig. 1, consisted of a semi-circular piece of one-half
inch plyboard, four feet in diameter with one and 
five-eighths inch holes in it. These holes were equally 
spaced on radii from a central hole at 0°, 30°, 60°, 
90°, 120°, 150°, and 180°. Each direction was in- 
dicated by decal letters from A to G mounted on the 
work board, starting at the S’s left. The work table 
was 30 inches high. 

The cylindrical object which the S manipulated 
was made of lucite rendered opaque by means of 
black paint. It was hollow, and different weights 
could be inserted into it. Prehension force was 
measured by means of a special variable capacitor 
force transducer which is known as the Franklin In- 
stitute Laboratories Pressure Indicating Patch, or 
“filpip” (1). The device was calibrated by means
of weights hung from it in a calibration jig designed
for the purpose. Forces during experimental runs
were recorded continuously on a direct writing oscillograph.
Since the measurement system remained
relatively stable, checks on the calibration were made
only before each S started his series and after he had
completed it.

Fig. 1. General view of work space and apparatus.

Experimental design. The effects on prehension 
force of the following independent variables were in- 
vestigated in a treatment by Ss factorial design: 

1. Weight, consisting of five levels: 18.1 gms., 45.4
gms., 118.0 gms., 308.5 gms., and 426.4 gms.

2. Direction, consisting of seven levels: 0°, 30°,
60°, 90°, 120°, 150°, and 180°.

3. Distance, consisting of three levels: 9.0, 30.8, 
and 52.6 cm. as measured from the center hole on 
the board. This corresponded to the first, middle, 
and outermost holes in the board at each angle. 
The locations were clearly marked by means of paper 
indicators at the bottom of each hole as distances 
one, two, and three respectively. 

4. Hand coverings, consisting of the bare hand as 
a control, latex surgeons’ gloves for a light hand 
covering, and five-finger leather Army gloves with
75% wool, 25% nylon knitted liners for the heavy
hand covering condition. No attempt was made to
quantify the concept of light and heavy hand covering
in terms of specific properties of the gloves.

Subjects. The experimental Ss, solicited from a
classroom, were six male sophomore engineering students.
All Ss were right-handed and ranged in age
from 18 to 28 years.

Procedure. Each S was instructed to grasp the 
test cylinder with his thumb and first two fingers, 
pick it up from the center hole, place it in the location
designated by the E, release it, then regrasp
it and replace it in the center hole. He was told 
that the device contained a sensitive measuring in- 
strument and that he was to work at whatever he 
considered a comfortable speed. At no time was it 
suggested that force was being measured. A post-experiment
interview indicated that most of the Ss
thought performance time was the criterion measure. 
No S mentioned force as a possible measure. The 
bare hand, light hand covering, and heavy hand 
covering conditions were given to each S in a dif- 
ferent order so that all six possible orders were rep- 
resented by the six Ss. The sequence of weight, dis- 
tance, and direction combinations was determined 
by a combined random number and card sorting 
technique for each S. From these individual pro- 
grams, the E called out discrete commands consist- 
ing of the alpha-numeric code designating direction 
and distance. The S was allowed approximately six 
practice trials before proceeding with the 315 ex- 
perimental trials. Once the experiment was started, 
stops were made only to change the hand cover- 
ings and to insert different weights into the hollow 
cylinder. 
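The size of the design follows directly from the levels listed above: 5 weights, 7 directions, and 3 distances give 105 combinations, and three hand conditions bring the total to 315 trials per S. The sketch below enumerates them and generates the alpha-numeric commands. The shuffling stands in for the random number and card sorting technique, whose details are not given, and the assignment of letter A to 0° is an assumption; the text says only that the letters begin at the S's left.

```python
import random
from itertools import permutations, product

WEIGHTS_GM = [18.1, 45.4, 118.0, 308.5, 426.4]
DIRECTIONS = dict(zip("ABCDEFG", [0, 30, 60, 90, 120, 150, 180]))  # assumed mapping
DISTANCES_CM = {1: 9.0, 2: 30.8, 3: 52.6}
HAND_CONDITIONS = ["bare", "light", "heavy"]

# All six orders of the three hand conditions, one per S.
ORDERS = list(permutations(HAND_CONDITIONS))


def trial_program(seed):
    """Randomly ordered (weight, letter, distance) combinations for one
    hand condition. A simple shuffle is used here as a stand-in."""
    rng = random.Random(seed)
    combos = list(product(WEIGHTS_GM, DIRECTIONS, DISTANCES_CM))
    rng.shuffle(combos)
    return combos


def command(letter, distance):
    """Alpha-numeric code called out by the E, e.g. 'C2'."""
    return f"{letter}{distance}"


program = trial_program(seed=1)
trials_per_subject = len(program) * len(HAND_CONDITIONS)
```

With six Ss and six possible orders of the hand conditions, each order is used exactly once, which balances any effect of practice or fatigue across the three conditions.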




Results 3 and Discussion


Typically, on grasping the cylinder, a sharp 
peak of force occurred, followed by a rapid 
adjustment to a lower level during transport. 
The criterion measure used for the present 
analysis was the maximum force on the re- 
grasp part of the cycle, that is, the peak force 
used by the S when he picked the cylinder up 
to return it to the center hole. It was read
to the nearest millimeter, and converted to
grams from a calibration chart.

A typical section of a record obtained for 
one of the Ss is shown in Fig. 2. 
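The conversion of a record deflection, read to the nearest millimeter, into grams by means of a calibration chart can be sketched as a linear interpolation between calibration points. The calibration values below are hypothetical and serve only to show the procedure; the actual chart is not reproduced in the report.

```python
from bisect import bisect_right

# Hypothetical calibration points: (deflection in mm, force in grams).
CALIBRATION = [(0.0, 0.0), (5.0, 200.0), (10.0, 450.0), (20.0, 1000.0)]


def deflection_to_grams(deflection_mm, chart=CALIBRATION):
    """Interpolate linearly between the two nearest calibration points."""
    mm = round(deflection_mm)  # records were read to the nearest millimeter
    xs = [x for x, _ in chart]
    if not xs[0] <= mm <= xs[-1]:
        raise ValueError("deflection outside the calibrated range")
    i = min(bisect_right(xs, mm), len(xs) - 1)
    (x0, y0), (x1, y1) = chart[i - 1], chart[i]
    return y0 + (y1 - y0) * (mm - x0) / (x1 - x0)
```

Interpolating between points, instead of fitting one straight line, allows for a transducer whose response is not strictly linear over the whole range.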


Fig. 2. Typical record of prehension force measurements.
(Letter designates direction; number
designates distance.)


An analysis of variance was made of the 
data in which the main effects were tested 
against the simple interactions between the 
respective main effect and Ss. The first or- 
der interactions were tested against the second 
order interactions involving Ss, and the same
procedure was employed for testing the sig- 
nificance of the higher order interactions. 
The significance level was set at p < .001 for 
all tests to take care of the fact that the 
standard deviations were rather large rela- 
tive to the means (2). All main effects ex- 
cept direction of movement were significant. 
The triple interaction between handcoverings, 
object weight, and Ss was also significant, in- 
dicating a need for additional investigation of 
these variables. 
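Testing a main effect against its interaction with Ss, as described above, can be illustrated for a single factor. The sketch assumes one score per S at each level of the factor (for example, each S's mean over the other conditions); the data in the usage example are hypothetical, and the function name is illustrative.

```python
def treatment_by_subjects_f(data):
    """F ratio for a treatment effect tested against treatment x Ss.

    data[s][j] is the score of subject s at treatment level j.
    Returns (F, df_effect, df_error).
    """
    n, k = len(data), len(data[0])
    grand = sum(map(sum, data)) / (n * k)
    level_means = [sum(row[j] for row in data) / n for j in range(k)]
    subject_means = [sum(row) / k for row in data]

    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_effect = n * sum((m - grand) ** 2 for m in level_means)
    ss_subjects = k * sum((m - grand) ** 2 for m in subject_means)
    # What remains is the treatment x Ss interaction, used as the error term.
    ss_error = ss_total - ss_effect - ss_subjects

    df_effect, df_error = k - 1, (k - 1) * (n - 1)
    f_ratio = (ss_effect / df_effect) / (ss_error / df_error)
    return f_ratio, df_effect, df_error


# Hypothetical scores for three Ss at two levels of one variable.
f_ratio, df1, df2 = treatment_by_subjects_f([[1, 3], [2, 4], [3, 8]])
```

Because consistent differences among Ss are removed from the error term, an effect that appears in every S can be detected even when the Ss differ widely in their general level of force.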

Figure 3 is a plot of the mean values for
the main effects found to be statistically significant.
Under the conditions of this experiment,
it is apparent that object weight is the
most extensive contributor to the stimulus
complex determining prehension force. This
result was expected in conformity with common
experience.

3 Tables of the analysis of variance and the mean
values and standard deviations of the main effects
have been deposited with the American Documentation
Institute. Order Document No. 5428 from ADI
Publications Project, Photoduplication Service, Library
of Congress, Washington 25, D. C., remitting
in advance $1.25 for microfilm, or $1.25 for photocopies.
Make checks payable to Chief, Photoduplication
Service, Library of Congress.

Distance has the least effect, ranging from 
576 to 794 grams over the conditions of the 
experiment. The effect of distance is tenta- 
tively attributed to overcompensation for the 
increased muscle tension required as the arm 
is extended. 

It is of interest to note that even light 
surgeon’s gloves appear to affect prehension 
force. This leads us to propose that the
modus operandi for the effect of handcover- 
ings on finger manipulation is to distort tactile 
cues. It may be speculated that this distor- 
tion takes the form of lowered tactile sensi- 
tivity and false cues from nonlinear transmis- 
sion of information from the surface of the 
handcovering. As indicated by another study 
in this laboratory, the precise nature of the 
effect appears to depend on such properties 
as the friction and compressibility of the 
handcovering materials over the fingers (3). 
It seems reasonable to suppose that a mini- 
mum amount of compression of the glove ma- 
terials is necessary to transmit the knowledge 
to the wearer that the object is being held 
securely. For very light objects, such as the 
empty cylinder, the normally lower weight 
discrimination capacity expected for each S 
suggests that weight, as such, is probably not 
an important cue to the amount of prehension 
force needed to secure grasp. Force may be
applied at an arbitrary level to prevent slipping
between the object-glove-hand interfaces.
The variable of weight becomes important
when the friction between the glove and hand
and/or glove and object is exceeded so that
slipping begins. As the weight of the object
is increased, more prehension force is required
to assure secure grasp. Up to a point the glove
materials will compress as a function of the
weight, after which the compression will be
maximum.

[Fig. 3. Mean values for statistically significant variables. Axes: distance (mm); weight (gm).]

This exploratory experiment suggests
that variations in prehension
force, as affected by physical variables inherent
in the task and possibly also by changes
in the amount of tactual sensory information
received by the operator during task performance,
may be of critical importance in manual
skill. We feel, therefore, that some measure 
of prehension force has potential value as an- 
other index of motor skill. 


Summary 


An exploratory investigation of the effects
of certain physical variables upon changes in
thumb-fingertip grasp forces during manipulation
has been conducted for a light psychomotor
task. The rationale for this study was
based upon the opinion that a dynamic measurement
of prehension forces might provide
information about perceptual aspects of motor
skills not accounted for by other performance
measures.
The task consisted of simple grasp, transport,
and release of a cylindrical object into
designated holes of a formboard. This object 
was instrumented with a pressure transducer 
permitting continuous recording of grasp force 
variations. The task was administered in a 
factorial treatment by subject design to six 
engineering students and the following vari- 
ables were investigated at several levels: (a) 
handcoverings; (b) object weight; (c) distance
moved; (d) direction of movement.
Analysis of variance indicated that hand- 
coverings, weight, and distance exert a sig- 
nificant effect upon prehension force during 
a given task. 

It was concluded that this measurement 
seems to have potential value as an index of 
motor skills in that it is probably sensitive 
to changes in the amount of tactile sensory
information available as well as to physical
variables of the task. 


Received March 22, 1957.


The Edwards Personal Preference Schedule (EPPS) and
Fakability 1


Bernard Borislow 


University of Pennsylvania 


Edwards (2) has constructed the Edwards 
Personal Preference Schedule to measure the 
magnitude of fifteen “needs” (after Murray, 
4) assumed to be operative in the “normal 
adult personality.” 

The EPPS is a binary forced-choice objec- 
tive-type inventory the construction of which 
is based upon a novel and ingenious matching 
technique in an attempt to reduce the ap- 
pearance of respondent “faking.” Based upon 
earlier work (1), Edwards has equated his 
item alternatives on the basis of a social
desirability continuum while still maintaining the
discriminatory power between the alternatives
available in any one item of the inventory. 
In this way he hoped to eliminate choices 
that were made on the basis of the greater 
social desirability of one of the alternatives. 
It appears that social desirability is a cri- 
terion which a respondent can use when he 
attempts to “fake” his answers to a person- 
ality inventory. It has been assumed that 
when the social desirability of the alterna- 
tives cannot be discriminated the respondent 
will find great difficulty in misrepresenting 
his “personality traits.” 

Recently, Rosen (6) has introduced an- 
other aspect of desirability into the area of 
inventory fakability termed personal desir- 
ability. This idea is not entirely new. Rog- 
ers and Dymond (5) used the concept of the 
“ideal self” in evaluating the outcomes of
psychotherapy. Essentially, personal desir- 
ability is the choice of traits on the basis of 
“how the individual would like to be” rather 
than “how the individual thinks he is” (self- 
appraisal). We are confronted with at least 
two aspects of fakability or desirability—so- 
cial and personal. 

We must introduce another consideration at
this point. The fakability of a personality
inventory offers no real problem if the falsified
responses can be detected. If a respondent
has “faked” his answers and this behavior
can be identified, we have gained some
information about the respondent’s “personality”
even though we must disregard the results
of the inventory.

1 The author gratefully acknowledges the counsel
and encouragement of M. S. Viteles in the preparation
of this report.

That Edwards has performed a remarkable 
task with a great deal of ingenuity in constructing
the EPPS cannot be disputed. However,
we are confronted with the question of
how well Edwards has succeeded in eliminating
the appearance of “faking” on the EPPS
and, more important, whether such behavior
can be detected to the extent that it does occur.

Certain definitions introduced now will 
prove helpful. The profile correlation is an 
index of the relationship between the profiles 
yielded by two individual administrations of 
the EPPS to the same subject. It is derived 
by ranking and correlating the T scores ob- 
tained for each of the fifteen scales from one 
administration of the EPPS with the T scores 
obtained from a second administration. The 
consistency score is the only direct and im- 
mediate device for determining the “honesty” 
of the respondent’s behavior. It is based 
upon fifteen duplicated items “built into” the 
inventory with a view toward using the re- 
sponses as a check on the consistency of the 
respondent in answering the inventory. The 
profile stability coefficient is an index of the 
uniformity of the responses (that go to make 
up each of the fifteen scales) distributed 
across the answer sheet. Since half of the 
raw score for each scale is derived from the 
rows and half from the columns of the an- 
swer sheet, it is possible to correlate these 
half scores by their respective ranks hori- 
zontally and vertically to yield the profile 
stability coefficient. This is another, although 
more indirect, check on respondent bias. Finally,
the group profile is the profile derived
from the pooling of individual subject profiles
(by a procedure to be discussed) in the same
treatment group.
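
The three individual indices defined above can be sketched in a few lines. This is an illustration only, not the author's computing procedure: the function names are hypothetical, the rank correlation is assumed to be Spearman's rho (the text says only that the scores are ranked and correlated), and tied scores are assumed to receive the mean of the ranks they span.

```python
from typing import Sequence


def average_ranks(values: Sequence[float]) -> list:
    """Rank values 1..n; tied values share the mean of the ranks they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rank_correlation(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rho computed as the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5


def consistency_score(first: Sequence[str], second: Sequence[str]) -> int:
    """Number of duplicated items answered identically on both occurrences."""
    return sum(a == b for a, b in zip(first, second))


# Profile correlation: the 15 scale T scores from two administrations.
# Profile stability coefficient: the row half scores against the column
# half scores of a single answer sheet. Both use the same rank correlation.
```

Under these assumptions the profile correlation is `rank_correlation(t_scores_phase1, t_scores_phase2)` and the profile stability coefficient is `rank_correlation(row_half_scores, column_half_scores)`, each taken over the fifteen scales.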


Hypotheses Tested 


1. Profile correlations derived from intra- 
subject comparisons of the administration of 
the EPPS under standard conditions (self- 
appraisal) and then under an experimental 
condition (a “mental set” to “fake” either 
socially or personally desirable traits) will 
differ significantly from those obtained from 
a comparison of two standard (self-appraisal) 
administrations. 

2. Consistency scores derived from the administration
of the EPPS under a “mental
set” to “fake” (a) socially desirable or (b)
personally desirable traits will differ significantly
from those obtained under standard
conditions of self-appraisal.

3. Profile stability coefficients derived from 
the administration of the EPPS under a 
“mental set” to “fake” (a) socially desirable 
or (b) personally desirable traits will differ
significantly from those obtained under standard
conditions of self-appraisal.

4. The number of intra-subject response 
changes that occur between the administra- 
tion of the EPPS under an initial standard 
(self-appraisal) administration and then un- 
der an experimental administration (either 
social or personal desirability “set”) will 
differ significantly from the changes that oc- 
cur between two standard (self-appraisal) 
administrations. 

5. The group profile derived from the ad- 
ministration of the EPPS under a “mental 
set” to “fake” socially desirable traits will 
differ significantly from that obtained under 
a “mental set” to “fake” personally desirable 
traits. 


Method 


Subjects. The experimental sample was selected
at random from a larger group of volunteers in an
elementary psychology course during the summer
session of 1956 at the University of Pennsylvania.
The sample was divided into three groups by a random
procedure¹ after the completion of the first
phase of the study (initial self-appraisal testing).
The three groups were labeled Control, Social Desirability
(SD), and Personal Desirability (PD) groups
according to the procedure (discussed below) of the
second phase of the experiment. The Control group
consisted of three males and three females with an
age range of 18-24 years and with a median age of
19 years. The SD group consisted of three males
and three females with an age range of 18-28 years
and with a median age of 21.5 years. The PD group
consisted of four males² and three females with an
age range of 20-27 years and with a median age of
20 years. The sex and age compositions of the three
groups do not differ significantly and by definition
are samples drawn from the same population.

1 With one restriction. Equal or nearly equal numbers
of males and females were assigned to each
group.

Procedure and Instructions. Phase 1 consisted of 
administering the EPPS to three groups of size N = 6,
N = 6, and N = 7, using the standard administration
instructions (self-appraisal). Each subject
was given a code designation which he wrote on the
answer sheet in place of his name along with his sex
and age. The same code was used in Phase 2 by
each subject so that Phase 1 and Phase 2 profiles
could be matched while still keeping the responses
anonymous. The entire group of N = 19 was then subdivided
into the Control, SD, and PD groups.

Phase 2 consisted of readministering the EPPS to 
all the subjects under directions appropriate to the 
group to which they were assigned. Prior to the 
second administration of the EPPS to the Control 
group, they were instructed that the second stand- 
ard administration was important to the purposes of 
the study so as to maintain an adequate level of 
motivation. Prior to the second administration of 
the EPPS to the SD group, they were informed 
that they should try to respond as they believed a 
“perfect individual characterized by those traits that 
society considers highly desirable” would respond. 
Prior to the second administration of the EPPS to 
the PD group, they were informed that they should 
try to respond according to how they would “like 
to be” rather than how they “actually are.” 

An interval of two weeks elapsed between Phase 1 
and Phase 2. 


Results 


Table 1 shows the profile correlations obtained
from the three groups.³ For each subject,
the profile correlation indicates the degree
of relationship between his initial self-appraisal
profile (Phase 1) and his second
profile taken under either self-appraisal, social
desirability, or personal desirability instructions,
depending upon the group to which
he was assigned (Phase 2).

A significant difference (P = .002, Mann-Whitney
U test) exists between the Control
and SD groups and between the Control and
PD groups (P < .004, Mann-Whitney U test).
There is no significant difference between the
two experimental groups, SD and PD (P >
.20, Mann-Whitney U test). Therefore, Hypothesis
1 is held tenable. The influence of
a mental set to “fake,” either under social
or personal desirability instructions, has produced
personality profiles significantly different
from profiles obtained under self-appraisal
conditions.

2 The additional male reported for the experiment
“by accident” and it was decided to use him.

3 In essence, those shown for the Control group
are test-retest reliability coefficients.

Table 1

Individual Profile Correlations

Control SD PD

91 85 85 17 .70 65

Table 2 shows the consistency scores computed
for both administrations of the EPPS
for each subject.⁴

The consistency score derived from the first 
administration of the EPPS was compared 
with the score derived from the second ad- 
ministration for each subject within the three 
groups. There were no statistically significant 
(P’s > .05, Wilcoxon matched-pairs signed- 
ranks test) or practical differences. 

Comparisons between groups under Phase 1
(initial self-appraisal condition) showed no
statistically significant (P’s > .30, Mann-Whitney
U test) difference between the samples
of the consistency scores.

Comparisons between groups under Phase 2 
indicated that both the Control and PD groups 
were statistically more consistent than the SD 
group (P’s < .05, Mann-Whitney U test); there
was no statistically significant difference be- 
tween the Control and PD groups (P > .418, 
Mann-Whitney U test). 


4 Even though the groups are listed separately,
Phase 1 consisted of a uniform administration of the 
EPPS to all subjects under standard instructions of 
self-appraisal. 


It appears as though self-appraisal and per- 
sonal desirability mental sets result in signifi- 
cantly more consistent responses than does a 
social desirability set. This result is not sur- 
prising if we recall that Edwards constructed 
the EPPS so that social desirability as a cri- 
terion for “faking” behavior would be opti- 
mally eliminated. 

Further observation of the consistency 
scores shows that five of the scores fall be- 
low Edwards’ lower limit of acceptability 
(score below 10 indicates an inconsistent and 
therefore questionable profile). Four of those 
scores come from the group of 25 profiles de- 
rived from self-appraisals. Only one comes 
from the group of 13 faked profiles. 

Therefore, on the basis of practical signifi- 
cance (in addition to the statistically non- 
significant findings between the Control and 
PD groups) we must say that the consistency 
score cannot discriminate faked profiles from 
self-appraisal profiles. Hypothesis 2 is re- 
jected. 

Table 3 shows the profile stability coefficients
obtained for all subjects under both
administrations of the EPPS.⁵

All comparisons between groups for both 
Phase 1 and Phase 2 yield nonsignificant dif- 
ferences for samples of profile stability coeffi- 
cients (P’s > .05, Mann-Whitney U test). 

Therefore, Hypothesis 3 must be rejected.
Profile stability does not deteriorate under
faked conditions when compared to self-appraisal
results. In fact, there is some evidence
to indicate that faking under a personal
desirability mental set yields a profile
significantly more stable than does the self-appraisal
condition. This evidence is derived
from a within-group comparison for the PD
group. Profiles were more stable under the
personal desirability condition than under the
self-appraisal condition for this group (P =
.05, Wilcoxon matched-pairs signed-ranks
test).

5 See Footnote 4.

Table 2

Consistency Scores

Phase 1: Control SD PD    Phase 2: Control SD PD

13 13 12 12 10 10 8

Table 4 shows the number of changed re- 
sponses (number of items answered differ- 
ently in Phase 2 when compared to Phase 1) 
for each subject in the three groups. 

Significantly more responses per subject 
were changed by Group SD than by the Con- 
trol group (P < .002, Mann-Whitney U test). 
Similarly, more responses per subject were 
changed by Group PD than by the Control 
group (P < .01, Mann-Whitney U test). 
There is no difference between the experi- 
mental groups, SD and PD (P > .50, Mann- 
Whitney U test). 
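
The between-group comparisons reported here rest on the Mann-Whitney U test. A minimal sketch of that test for samples of this size follows; it obtains an exact two-sided probability by enumerating every way of splitting the pooled scores into two groups. The function names are hypothetical and the changed-response counts are invented for illustration; they are not the values of Table 4, and the probabilities quoted in the text may have been read one-tailed from published tables.

```python
from itertools import combinations


def mann_whitney_u(x, y):
    """U for sample x: count of (x_i, y_j) pairs with x_i > y_j; ties count 0.5."""
    return sum((a > b) + 0.5 * (a == b) for a in x for b in y)


def exact_two_sided_p(x, y):
    """Exact two-sided probability of U by complete enumeration of regroupings."""
    pooled = list(x) + list(y)
    n1, n2 = len(x), len(y)
    mean_u = n1 * n2 / 2
    observed = abs(mann_whitney_u(x, y) - mean_u)
    hits = total = 0
    for idx in combinations(range(len(pooled)), n1):
        chosen = set(idx)
        gx = [pooled[i] for i in idx]
        gy = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Count regroupings at least as extreme as the observed one.
        if abs(mann_whitney_u(gx, gy) - mean_u) >= observed - 1e-12:
            hits += 1
    return hits / total


# Hypothetical changed-response counts (illustrative only).
control = [30, 32, 33, 34, 38, 47]
faked = [60, 71, 75, 82, 90, 95]
u = mann_whitney_u(control, faked)
p = exact_two_sided_p(control, faked)
```

With two groups of six and no overlap between them, only two of the 924 possible regroupings are as extreme as the observed one, which is the smallest two-sided probability such a design can yield.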

Therefore, Hypothesis 4 is held tenable. 
Faking produces more response changes per 
subject than does a re-self-appraisal. This 
result seems to be an obvious one. If faking 
produces different profiles, the only way this 
can come about is by way of responding 
differently. However, if responses changed 
greatly for the Control group then we would 
be unable to attribute response changes in the 
experimental groups to the experimental variables.
If there were no differences in changed-response
scores between the Control and experimental
groups we would then have to attribute
response changes in these latter groups
to error variance (unreliability of the instrument
as well as intra-organismic changes).⁶

Table 3

Profile Stability Coefficients

Phase 1: Control SD PD    Phase 2: Control SD PD

.99 86 84 99
85 81 82
.63 74 79 80
59 .63 73 75
53 59 54
40 38 4 33
34

Table 4

Changed Response Scores

Control SD PD

47 38 34 32 32 30

In order to determine if a mental set to 
fake socially desirable responses is different 
from a mental set to fake personally desirable 
responses we would need significant within- 
groups homogeneity and between-groups het- 
erogeneity in a comparison of Groups SD and 
PD. 

A low but significant relationship exists be- 
tween the individual profiles of the subjects 
in the SD group derived from the faked situa- 
tion (coefficient of concordance, W = .382, 
P < .01). Similarly, a low but significant
relationship exists between the individual profiles
of the subjects in the PD group derived
from the faked situation (coefficient of concordance,
W = .255, P < .05). Using the
sums of ranks for each scale under the SD 
and the PD conditions (derived from the 
above calculations of concordance) we can 
rank the scales under each condition. This 
is probably the best estimate of the true psy- 
chological ranking for each group across all 
scales. There is no significant difference be- 
tween the scale rankings of Groups SD and 
PD (P > .05, Wilcoxon matched-pairs signed- 
ranks test). 
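
The coefficient of concordance used above can be sketched as follows. The sketch assumes untied ranks of the fifteen scales within each subject and tests W by the usual chi-square approximation, chi-square = m(n - 1)W with n - 1 degrees of freedom; whether the author used this approximation or tabled values is not stated, and the function names are hypothetical.

```python
def kendalls_w(rankings):
    """Kendall's coefficient of concordance.

    rankings: one list per subject, each holding the ranks 1..n (no ties)
    that the subject's profile assigns to the n scales.
    """
    m = len(rankings)          # number of subjects
    n = len(rankings[0])       # number of scales ranked
    rank_sums = [sum(r[j] for r in rankings) for j in range(n)]
    mean_sum = m * (n + 1) / 2
    s = sum((total - mean_sum) ** 2 for total in rank_sums)
    return 12 * s / (m ** 2 * (n ** 3 - n))


def concordance_chi_square(w, m, n):
    """Chi-square approximation for W, referred to n - 1 degrees of freedom."""
    return m * (n - 1) * w


# Reported values: SD group, 6 subjects; PD group, 7 subjects; 15 scales.
chi_sd = concordance_chi_square(0.382, m=6, n=15)
chi_pd = concordance_chi_square(0.255, m=7, n=15)
```

For 14 degrees of freedom the tabled critical values of chi-square are 29.14 at the .01 level and 23.68 at the .05 level. The SD value computed this way exceeds the former and the PD value exceeds only the latter, which agrees with the significance levels reported in the text.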

Therefore, Hypothesis 5 must be rejected. 
We cannot say that faking under a social de- 
sirability mental set is different from faking 
under a personal desirability mental set on 
the EPPS. 

Another interesting result has been obtained
which casts doubt upon the use of the consistency
score and the profile stability coefficient
as measures of response coherency or
fakability. There is no significant relationship
between profile stability coefficients and
consistency scores (Spearman rho coefficient
= .35, P > .05). This result is based upon
the 19 profiles derived from self-appraisal
conditions (Phase 1) where all subjects in
the study were subjected to identical conditions,
taking the EPPS under standard test
instructions and anonymity. There were no
significant differences between men and women
either on samples of profile stability coefficients
or on consistency scores (P’s > .10,
Mann-Whitney U test).

6 See Discussion section for further implications of
response changes.

Finally, it should be noted that Edwards 
has forcefully eliminated the effects of social 
desirability as an influential determiner of 
fakability on the EPPS. Although the SD 
group was able to coherently falsify its 
profile results, only a low relationship (W =
.382) exists between the profile patterns of
the subjects in that group. That the rela- 
tionship is low indicates that different item 
alternatives were chosen as being socially de- 
sirable for the different subjects. Further, 
even though 5 of the 6 Consistency scores for 
the SD group (under Phase 2) are considered 
“acceptable” (see Table 2), these scores are 
generally lower than those of either the Control or PD
groups. This indicates that, when a respondent
attempts to choose socially desirable answers
on the EPPS, his responses are apt to
be less consistent than if he were to answer 
on a self-appraisal or personal desirability 
basis. 


Discussion 


It is possible to give a stable and consist- 
ent pattern of responses and yet present a 
completely misleading picture of one’s per- 
sonality (in terms of a need system) as meas- 
ured by the Edwards Personal Preference 
Schedule. The EPPS can be faked without
detection where the easily derived consistency
score and the less easily derived profile
stability coefficient are used as the indices of
detection.

When we examine the tenability of Hypothesis 4,
which deals with absolute numbers
of changed responses, we are faced with
this argumentative question: In spite of the
fact that the experimental groups showed a 
significantly greater number of changed re- 
sponses than the Control group, is it not pos- 
sible that the SD and PD group subjects 
changed their responses merely because they 
interpreted their instructions to mean “change 
your answers from the answers you gave last 
time”? That is, the subjects might not have 
been able to abide by social or personal de- 
sirability instructions and were forced to 
choose alternatives on a random basis. This 
would also yield a great number of changed 
responses. 

There is overwhelming evidence that the 
SD and PD subjects were not responding on 
a purely random basis. The very nature of
the consistency score (number of 15 duplicated
items responded to identically) overrides
the proposed argument. The probability
of the 13 observed consistency scores for
Groups SD and PD (under Phase 2, the experimental
condition) occurring by chance is
less than one in a thousand (χ² = 123.9, 26
df; see 3, pp. 103-105).
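
One way to arrive at a chi-square with 26 degrees of freedom from 13 independent scores is Fisher's method of combining probabilities, which assigns 2 degrees of freedom to each probability combined. The sketch below illustrates that route. It is an assumption that this is the procedure of the cited reference, and a further assumption that a subject responding at random matches each duplicated pair with probability one-half; the function names and the scores shown are hypothetical.

```python
from math import comb, log


def chance_probability(score, items=15):
    """P(at least `score` of `items` duplicated pairs match) when each
    pair matches independently with probability 1/2 (random responding)."""
    return sum(comb(items, k) for k in range(score, items + 1)) / 2 ** items


def fisher_chi_square(probabilities):
    """Fisher's combined statistic -2 * sum(ln p) and its degrees of freedom (2k)."""
    statistic = -2 * sum(log(p) for p in probabilities)
    return statistic, 2 * len(probabilities)


# Hypothetical consistency scores for 13 subjects (illustrative only).
scores = [14, 13, 13, 12, 12, 12, 11, 11, 10, 10, 10, 9, 8]
statistic, df = fisher_chi_square([chance_probability(s) for s in scores])
```

The resulting statistic is referred to the chi-square distribution with `df` degrees of freedom; a large value indicates that the set of scores as a whole is too consistent to be attributed to random responding.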

Although Hypothesis 5 has been rejected, 
there is evidence to suggest that social desirability
and personal desirability are distinct
concepts and were successfully induced and 
manipulated in this study. The first bit of 
evidence is the fact that the choice of al- 
ternatives using a social desirability criterion 
seems to be more difficult than the use of a 
personal desirability criterion on the EPPS; 
the responses of subjects under the PD con- 
dition were significantly more consistent than 
under the SD condition. Secondly, the con- 
cordance of profiles under the two conditions 
is somewhat different; more concordance ex- 
ists for the SD profiles, as would be expected. 
Finally, a negative (non-significant) relation- 
ship is present between group profiles, based 
on sums of scale ranks, between the two 
groups. 


Summary and Conclusions 


This study was designed to determine if 
the Edwards Personal Preference Schedule 
(EPPS) could be “faked” without detection 
in a laboratory situation using college students
as subjects under anonymous conditions.
Nineteen subjects (10 men and 9
women) took the EPPS under standard con- 
ditions (self-appraisal). Two weeks later, 
three groups consisting of approximately an 
equal number of men and women were ran- 
domly constituted; one was the Control group 
(self-appraisal retest) and two were experi- 
mental groups (Social Desirability retest 
group and Personal Desirability retest group). 
Consistency scores, profile stability coefficients,
and individual profile correlations were
computed, and statistical tests were employed
to test certain hypotheses. The following
conclusions seem warranted: 

1. The Edwards Personal Preference Sched- 
ule can be faked under structured personal 
and social desirability instructions. 

2. The consistency score and the profile 
stability coefficient are not adequate indices 
of inventory fakability. 

3. There is evidence that differential cri- 
teria exist for fakability in terms of desir- 
ability of response alternatives—social and 
personal. 

4. The Edwards Personal Preference Schedule
is not greatly susceptible to the influence
of fakability in terms of choice of socially desirable
items, per se.


Received April 1, 1957. 
Air Force Personnel and Training Research Center


In the course of investigating the relation- 
ship between the learning of artificial language 
materials of different degrees of organization 
and the learning of English, we obtained re- 
call and word-prediction scores for a number 
of English passages of approximately equal 
length. It seemed to us that these data af- 
forded an excellent opportunity for determin- 
ing the interrelationship of the readability of 
the passages, the ease with which they were 
learned, and the degree to which their con- 
stituent words were predictable. There was 
reason to believe that significantly positive 
intercorrelations would be found. Consider
learning and prediction. If we follow infor- 
mation theorists in accepting the close con- 
nection between prediction and amount of in- 
formation, then the findings of Miller and 
Selfridge (5) or those of the present authors 
(1, 7) indicate that a substantial correlation 
between amount learned and success in pre- 
diction should exist. As for learning and 
readability, the likelihood that they are cor- 
related is strongly suggested by the fact that 
both are related to ease of comprehension. 
Reed (6), for example, reports that Ss were 
able to learn a passage of easy English prose 
in less than half the time required to learn a 
passage taken from the writings of Hume. In 
addition, it may be noted that the discrimi- 
nation power of reading-ease formulas was 
initially tested against a criterion of comprehensibility.
With regard to prediction and
readability, Taylor’s (8) study, though involving
only small groups of Ss and a very
small number of language samples, indicates
that “Cloze” procedure (essentially a measure
of predictability based upon a knowledge
of both the preceding and succeeding context)
tends to rank prose passages of different
levels of difficulty in about the same way
as the readability ratings of the two most
widely used formulas. For example, Taylor
obtained a correlation of .46 between Cloze
and Dale-Chall rankings even when his data
consisted of passages deliberately selected to
amplify the weaknesses of current methods of
measuring readability.

1 This report is based on work done under ARDC
Project No. 7730, Task No. 17125, in support of the
research and development program of the Air Force
Personnel and Training Research Center, Lackland
Air Force Base, Texas. Permission is granted for
reproduction, translation, publication, use, and disposal
in whole or in part by or for the United States
Government.

2 Now at The Operational Applications Laboratory,
Air Force Cambridge Research Center, Bolling Air
Force Base, Washington 25, D. C.

3 Now at The Industrial College of the Armed
Forces, Washington, D. C.


Materials 


Thirty passages, each approximately 200 
words in length, were selected from such 
widely varying sources as children’s stories 
(10 passages), popular magazines of the 
Saturday Evening Post variety (10 passages), 
higher quality magazines (e.g., The New 
Yorker), scientific texts, and philosophical 
writings (10 passages). Each passage was 
reproduced on a single sheet of paper, single 
spaced, each sentence beginning on a new 
line (for purposes of reading similarity with 
other materials). The passages bore no 
identification regarding author, source, or 
level of reading difficulty. 


Readability 


Two measures of the readability of these 
passages were employed: the Flesch reading- 
ease formula (3) and the Dale-Chall formula 
for predicting readability (2). The first 100 
words of each passage were examined to ob- 
tain the mean number of syllables per word 
for the Flesch score and to obtain the per- 
centage of infrequent words according to 
Dale-Chall. These same 100 words were used 
to obtain the mean sentence length. When 
the hundredth word was not the final word of 
a sentence, the sentence containing the hundredth
word was included or excluded, depending
upon whether exclusion or inclusion
brought the sample size closer to 100. The
means and standard deviations of the distributions
of readability scores are shown in
Table 1.

Table 1

Summary Statistics for the Variables

Variable                  Range of Scores*   Mean     SD
Flesch Readability        97.4-1.0           56.93    23.65
Dale-Chall Readability    4.2-10.7           7.44     1.96
Amount Learned            27.4-10.1          18.50    5.12
Prediction                14.0-6.4           9.03     2.07

* From “easy” to “difficult.”
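
The two readability formulas may be sketched as follows, using the coefficients with which they are commonly cited (Flesch reading ease, 1948; Dale-Chall, 1948). It is assumed, not confirmed by the text, that these are the exact forms applied here. The counts of syllables and of words not on the Dale list are taken as inputs, since both depend on reference materials not reproduced in this report; the function names and the sample counts are hypothetical.

```python
def flesch_reading_ease(words, sentences, syllables):
    """Flesch reading ease: higher scores indicate easier material."""
    mean_sentence_length = words / sentences
    syllables_per_word = syllables / words
    return 206.835 - 1.015 * mean_sentence_length - 84.6 * syllables_per_word


def dale_chall_raw_score(words, sentences, unfamiliar_words):
    """Dale-Chall raw score: higher scores indicate harder material."""
    percent_unfamiliar = 100 * unfamiliar_words / words
    mean_sentence_length = words / sentences
    return 0.1579 * percent_unfamiliar + 0.0496 * mean_sentence_length + 3.6365


# Hypothetical 100-word sample: 5 sentences, 150 syllables,
# 10 words not on the Dale list.
flesch = flesch_reading_ease(words=100, sentences=5, syllables=150)
dale_chall = dale_chall_raw_score(words=100, sentences=5, unfamiliar_words=10)
```

The opposite direction of the two scales is the reason the ranges in Table 1 run from high to low for the Flesch score and from low to high for the Dale-Chall score when both are ordered from easy to difficult.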


Amount Learned 


Subjects. One hundred and twenty-five col- 
lege students (82 males, 43 females; 26 Fresh- 
men, 40 Sophomores, 27 Juniors, 25 Seniors, 
2 Graduate, and 5 unclassified students) were 
selected for participation in a learning experi- 
ment involving protracted training in the mas- 
tery of artificial language materials and in the 
memorization of language samples (both arti- 
ficial and English) for five different study- 
time intervals. These 125 students were 
screened from a recruitment of about twice 
that number on the basis of age, level of edu- 
cation, and other traits and abilities as meas- 
ured by the following tests: the ACE (1949 
Edition), the Digit Span subtest of the Wechs- 
ler-Bellevue Intelligence Scale, the Minnesota 
Multiphasic Personality Inventory, and the 
Clyde Projective Test. The selection pro- 
cedures were aimed at obtaining a group of 
Ss somewhat heterogeneous with regard to 
intelligence and rote memory, but relatively
homogeneous with regard to motivation to 
complete the experiment and the capacity to 
perform with reasonable consistency under 
stress. 

4 The training and testing of Ss was carried out
under Contract AF 41(657)-59 with the Auburn Research
Foundation, Inc., Alabama Polytechnic Institute,
Auburn, Alabama. Contract activities were under
the direction of Willard H. Nelson, Principal
Investigator, and Virginia Zachert, Research Supervisor.

Training. The Ss attended 36 hour-long
training sessions distributed over a period of
about 10 weeks. Seven of these sessions were
devoted to practice in the memorization of 
English passages of the same length and of 
the same range of difficulty as the 30 experi- 
mental passages described above. All Ss re- 
ceived the same amount of practice in five 
study-time intervals: one-half minute, one 
minute, two minutes, three minutes, and four 
minutes. Practice passages were placed in 
front of the Ss face down, and the study 
time to be allowed for memorization was an- 
nounced. At the signal to start, Ss turned the 
passage over and began memorizing. They 
had been coached in scanning passages from 
the first word on—attempting to learn in nor- 
mal reading sequence rather than by skipping 
around. At the signal to stop, Ss turned their 
passages face down and immediately recorded 
all they could recall on prepared answer 
sheets. The Ss then engaged in five minutes 
of interpolated activity, followed by another 
period of memorization (study-time intervals 
and levels of reading difficulty were random- 
ized among and within training sessions), and 
so on. 

Experimental groups. When training pro- 
cedures were completed, the Ss were divided 
into five matched groups on the basis of their 
scores on the following: ACE Quantitative, 
artificial language—low organization, artificial 
language—high organization, and English. 
The experimental groups were now adminis- 
tered the 30 experimental English passages, 
each group taking each passage for only one 
of the five study-time intervals. Both study 
times and levels of reading difficulty were 
randomized among groups and within experi- 
mental sessions. This procedure was carried 
out over a period of one week, with six pas- 
sages administered per daily session. 

Scoring. Two criteria had to be satisfied 
for a word to be considered correctly learned: 
First, the word had to be reproduced unam- 
biguously, as it appeared in the passage. For 
example, Houses was not accepted as correct 
if the word in the passage was house. Second, 
the word had to be reproduced in the correct 
serial position. Considerable latitude was allowed
here. A word was considered to be reproduced
in the correct position if it occurred
in the proper serial position either with respect
to the beginning of the passage or to
the beginning of the appropriate sentence. In
the latter instance, however, the word in question
had to be part of a sequence of at least
two words which was correctly located within
the sentence.

The mean amount learned per minute of 
study time was taken as the learning score for 
each passage. This was obtained by sum- 
ming the mean amounts learned in the five 
study-time intervals and dividing by 10.5, the 
sum of the study times. The mean and 
standard deviation of the distribution of 
learning scores are shown in Table 1. 
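
The learning score just described reduces to a single division. A minimal sketch follows; the function name and the sample means are hypothetical.

```python
# The five study-time intervals, in minutes; their sum is 10.5.
STUDY_TIMES_MIN = (0.5, 1.0, 2.0, 3.0, 4.0)


def learning_score(mean_words_learned):
    """Mean amount learned per minute of study time for one passage.

    mean_words_learned: the mean number of words correctly reproduced at
    each of the five study-time intervals, in the order of STUDY_TIMES_MIN.
    """
    if len(mean_words_learned) != len(STUDY_TIMES_MIN):
        raise ValueError("one mean is required for each study-time interval")
    return sum(mean_words_learned) / sum(STUDY_TIMES_MIN)


# Hypothetical means for one passage (illustrative only).
score = learning_score([10.5, 21.0, 42.0, 63.0, 84.0])
```

Dividing the summed means by the summed study times, rather than averaging five separate per-minute rates, gives the longer intervals proportionally more weight in the passage score.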


Prediction 


Subjects. Thirty-seven college students 
drawn from the same population as those who 
participated in the learning experiment (but 
excluding those who actually did participate) 
were selected to take part in a prediction ex- 
periment employing the same 30 English pas- 
sages. These students were selected to match 
the Ss used in the learning experiment on the 
basis of age, sex, level of education, and their 
scores on the ACE Total, L, and Q. 

Procedure. The Ss were given the first 
word of a passage and instructed to guess the 
next word. After they wrote their guesses in 
black pencil, they were told the correct word, 
i.e., the one actually occurring in the passage.
They were then instructed to write this
word in red pencil on the line above the word 
they had guessed, even if they had guessed 
correctly. Now they were told to guess the 
next word, given the correct word, told to 
record it above their guess, and so on. The 
Ss were repeatedly instructed to read through 
all of the passage covered thus far (which they 
had written in red above their guesses) be- 
fore guessing the next word. Periodically, 
the administrator would read aloud the part 
of the passage already covered. Several prac- 
tice passages were sequentially predicted in 
this manner before the 30 experimental pas- 
sages were administered. The experimental 
passages themselves were then taken in 20
daily sessions, with levels of reading difficulty
randomized among sessions. 

Since it would be fairly tempting for Ss to 
alter their guesses during the testing procedure,
measures were adopted to prevent
cheating. The Ss met in small groups of 
from four to ten persons, overseen by an ad- 
ministrator and a proctor. Furthermore, each 
answer sheet had a carbon paper and another 
sheet stapled under it, so that any erasure 
would appear as a smudge on the bottom 
sheet. 

Scoring. For a guess to be counted correct, 
the S had to write the word in exactly the 
same form as it appeared in the passage. 
Misspellings were considered correct only 
when they were unambiguous. 

The mean number of correct predictions 
per word, computed from the guesses on words 
2-67 in each passage, was taken as the pas- 
sage prediction score. The 67th word had to 
be taken as the limit since the prediction ex- 
periment was carried out only up to the maxi- 
mum amount learned in each passage and 
there was one passage in which no S was 
able to recall any word beyond the 67th. The 
mean and standard deviation of the distribu- 
tion of prediction scores are shown in Table 1. 
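
The scoring rule just described can be sketched as follows. This is a minimal illustration, assuming that "mean number of correct predictions per word" means the total number of correct guesses by all Ss on words 2-67 divided by the number of words scored; the function name and data layout are hypothetical, and the allowance for unambiguous misspellings is omitted.

```python
def passage_prediction_score(guesses_by_subject, passage_words, first=2, last=67):
    """Mean number of correct predictions per word for one passage.

    guesses_by_subject: one list per S, aligned with passage_words
        (the entry for word 1 is unused, since the first word was given).
    first, last: 1-indexed range of words scored (2-67 in the study).
    """
    positions = range(first - 1, last)  # convert to zero-based indices
    total_correct = 0
    for guesses in guesses_by_subject:
        for i in positions:
            # A guess counts only if it matches the passage word exactly.
            if guesses[i] == passage_words[i]:
                total_correct += 1
    return total_correct / len(positions)


# Toy example with a five-word "passage" and two Ss (hypothetical data).
passage = ["the", "cat", "sat", "down", "now"]
subject_1 = [None, "cat", "ran", "down", "now"]
subject_2 = [None, "dog", "sat", "down", "then"]
toy_score = passage_prediction_score([subject_1, subject_2], passage, first=2, last=5)
```

Under this reading the score for a passage can range from 0 to the number of Ss (37 in the experiment).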


Results and Discussion 


The coefficients of correlation shown in 
Table 2 are all significant beyond the .01 
level and may be taken as estimates of the 
degree of interrelationship between learning, 
predictability, and readability as measured by 
the techniques employed in this study. Dale- 
Chall scores appear to correlate more highly 
than Flesch scores with both learning and 
prediction, though only in the case of the 
correlations with learning is the difference 
between the two formulas statistically signifi- 
cant (p < .02).⁵ Klare (4) similarly pre- 
sents evidence that the Dale-Chall is superior 
to the Flesch formula in rating reading-test 
passages, though the differences he obtained 
are statistically nonsignificant. If an ample 
relationship between ease of learning and ease 
of comprehension were demonstrable, meas- 
ures of learning might provide more reliable 
criteria for the effectiveness of readability rat- 
ings than tests of comprehension, often diffi- 
cult to scale and control. 


Table 2 


Intercorrelation of the Variables 


                           Dale-Chall    Amount 
                           Readability   Learned   Prediction 

Flesch Readability            .91ᵃ         .61       .4? 
Dale-Chall Readability                     .75ᵃ      .60ᵃ 
Amount Learned                                       .73 


ᵃ Since Dale-Chall readability scores are inverse to the scores 
derived from the other three measures (see Table 1), a 
relationship with the Dale-Chall yields a negative coefficient. 
To avoid confusion, the negative sign has been removed. 

It is interesting to observe that the correla- 
tion between amount learned and prediction 
does not differ significantly from the correla- 
tion between amount learned and the Dale- 
Chall. One would expect that prediction 
would correlate more closely with learning 
than either readability formula since predic- 
tion and learning both involve the factor of 
contextual constraint. Most likely, the meas- 
ure of predictability employed in this study 
was not as sensitive as it might be. Rela- 
tively few words could be successfully pre- 
dicted on the basis of one trial and from a 
knowledge of the preceding context alone. 
Other measures of predictability, particularly 
those based on a knowledge of the context on 
both sides of the word or based upon some 
index of intersubject agreement would prob- 
ably show a higher correlation between these 
two variables than is here indicated. 
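
The comparisons of correlated coefficients discussed above can be reproduced approximately with Hotelling's t for two correlations that share one variable. The sketch below is illustrative only: the function name is hypothetical, the coefficients are those read from Table 2 (.75, .61, .91, .73, .60, taken here as assumptions where the print is unclear), and n = 30 passages is assumed because the study reports t at 27 degrees of freedom.

```python
import math


def hotelling_t(r_xy, r_xz, r_yz, n):
    """Hotelling's t for the difference between r_xy and r_xz.

    x is the variable common to both correlations; r_yz is the
    correlation between the two competing predictors.
    Returns (t, degrees of freedom), with df = n - 3.
    """
    # Determinant of the 3 x 3 correlation matrix of x, y, z.
    det = 1 - r_xy**2 - r_xz**2 - r_yz**2 + 2 * r_xy * r_xz * r_yz
    t = (r_xy - r_xz) * math.sqrt((n - 3) * (1 + r_yz) / (2 * det))
    return t, n - 3


# Learning vs. Dale-Chall (.75) against learning vs. Flesch (.61);
# the two formulas intercorrelate .91.
t_formulas, df = hotelling_t(0.75, 0.61, 0.91, n=30)

# Learning vs. prediction (.73) against learning vs. Dale-Chall (.75);
# prediction and Dale-Chall correlate .60.
t_prediction, _ = hotelling_t(0.73, 0.75, 0.60, n=30)
```

With these inputs the first t is about 2.7, which exceeds the two-tailed .02 critical value of roughly 2.47 at 27 degrees of freedom, while the second is close to zero, in line with the statements in the text.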

This method of prediction may also be re- 
sponsible for differences in the magnitude of 
the relationship between readability and pre- 
dictability obtained here and obtained by 
Taylor (8). Taylor reports a maximum cor- 
relation of .94 between readability rankings 
assigned by Cloze procedure and by the Dale- 
Chall, and a maximum correlation of .71 be- 
tween rankings assigned by Cloze procedure 
and by the Flesch formula. Of course, these 
coefficients are based on data obtained from 
only six passages; nonetheless, it is quite pos- 
sible that these higher correlations result from 
a more sensitive measure of predictability. 

5 Since this and all further comparisons involved 
sets of bivariates having one array in common, 
Hotelling’s test for differences between correlated co- 
efficients was employed. All statements of signifi- 
cance are based upon p values for the distribution 
of t at 27 degrees of freedom. 

Despite the differences between the two 
readability formulas in the degree to which 
they correlate with learning and prediction, 
they exhibit the same high degree of intercor- 
relation reported by other investigators (4, 8). 
They seem to be measuring the same factors 
—probably grammatical complexity (through 
the measure of sentence length) and vocabu- 
lary level. Judging from the apparent su- 
periority of the Dale-Chall when both for- 
mulas are tested against outside criteria, how- 
ever, it seems that the number of unfamiliar 
words in a passage of English gives a better 
estimate of vocabulary level than word length 
in syllables. 
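
To make the contrast between the two vocabulary measures concrete, the sketch below writes out both formulas in their commonly published forms. The report does not state which versions were applied, so the constants (Flesch Reading Ease and the 1948 Dale-Chall raw score) are an assumption, and the function and argument names are hypothetical. Counts of words, sentences, syllables, and unfamiliar words are taken as given.

```python
def flesch_reading_ease(words, sentences, syllables):
    """Flesch Reading Ease: higher scores mean easier text.

    Vocabulary level enters through word length in syllables.
    """
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def dale_chall_raw_score(words, sentences, unfamiliar_words):
    """Dale-Chall raw score: higher scores mean harder text.

    Vocabulary level enters through the percentage of words that are
    not on the Dale list of familiar words.
    """
    pct_unfamiliar = 100.0 * unfamiliar_words / words
    return 0.1579 * pct_unfamiliar + 0.0496 * (words / sentences) + 3.6365


# Hypothetical 100-word sample: 5 sentences, 150 syllables, 10 unfamiliar words.
sample_flesch = flesch_reading_ease(words=100, sentences=5, syllables=150)
sample_dale_chall = dale_chall_raw_score(words=100, sentences=5, unfamiliar_words=10)
```

Both formulas share the sentence-length term and differ only in the vocabulary term. The two scales also run in opposite directions, which is why their correlations are negative before the sign is dropped.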


Summary and Conclusions 


Subjects previously given intensive practice 
in memorizing English passages of a wide 
range of reading difficulty were assigned the 
task of learning as much as they could of 30 
experimental English passages in set periods 
of study. Another group of Ss, selected to 
match the learning group, went through these 
same 30 passages, predicting each successive 
word (only one guess allowed) from a knowl- 
edge of all the preceding context. Readabil- 
ity scores were calculated for each passage 
according to the Flesch and Dale-Chall for- 
mulas. Product-moment correlations were 
computed between the mean amounts learned, 
the mean number of correct predictions per 
word, the Flesch, and the Dale-Chall read- 
ability scores. Intercorrelations among the 
variables showed that: 

1. Learning, prediction, and readability are 
closely interrelated. 

2. Prediction and readability correlate about 
equally well with learning. However, the 
method of prediction employed in this study 
involved a single guess based upon a knowl- 
edge of the preceding context alone, and it 
may be that such a method yields a relatively 
insensitive measure of predictability. 

3. The Dale-Chall formula correlates sig- 
nificantly more closely than the Flesch for- 
mula with learning, and somewhat more 
closely—though not significantly so—with 
prediction. Whether this is to be taken as 
a demonstration of the superiority of the 
Dale-Chall formula, however, depends upon 
the degree of relationship between learning 
and comprehension. It is suggested that if 
ease of learning is shown to be sufficiently 
related to ease of comprehension, amount 
learned might provide a better criterion of 
readability than the more difficult to control 
tests of comprehensibility. 

4. Despite observed differences between the 
two readability formulas, they were found to 
correlate very highly with each other, thus 
supporting the notion advanced by other in- 
vestigators to the effect that the two formulas 
measure substantially the same things. 

5. Both readability formulas showed a 
higher correlation with learning than with 
prediction, but the differences were not sta- 
tistically significant. 


Received April 3, 1957. 
Relationship Between Stated and Measured Interests of 
Two Groups of United States Air Force Officers ¹ 


Paul G. Jenson ² 

Macalester College, St. Paul, Minnesota 


In a study conducted by the staff of the In- 
dustrial Relations Center at the University 
of Minnesota for the United States Air Force, 
Strong Vocational Interest Blanks (SVIB) 
and personal history questionnaires were com- 
pleted by Air Force Officers in the person- 
nel and accountant-comptroller areas. On the 
personal history questionnaires the officers in- 
dicated their choice of civilian occupation. 
The relationship between this civilian choice 
of occupation (stated interests) and meas- 
ured vocational interests is the subject mat- 
ter of this report. 


Methods and Procedure 


Completed materials were obtained from Air Force 
Officers who had Air Force specialty code numbers 
in the personnel or accountant-comptroller areas. In- 
cluded in this study are returns from 1155 personnel 
officers and 243 accountant-comptrollers. These N’s 
represent about an 84% return of the material which 
was originally sent. 

Expert judgments were used in two different ways. 
Three judges ³ independently interpreted the SVIB 
profiles using the Darley technique (1). Judgments 
of primary interest patterns were the basis for de- 
termining measured vocational interests. For a pri- 
mary interest pattern to exist a majority or plurality 
of scores in an occupational group had to be A or 
B+ scores. 

1 The original study of which this report is a part 
was supported by the United States Air Force un- 
der contract no. AF 18(600) 337 and was monitored 
by the Officer Personnel Division, Human Resources 
Research Institute, Air Research and Development 
Command, Maxwell Air Force Base, Alabama. The 
opinions and conclusions expressed herein are not to 
be construed as necessarily carrying the official sanc- 
tion of the Department of the Air Force or of the 
Air Research and Development Command. 

2 This paper is a part of the author’s Ph.D. thesis. 
The author is indebted to his major advisor, Donald 
G. Paterson, and to Marvin D. Dunnette, George 
W. England, Donald P. Hoyt, Thomas M. Magoon, 
and Harry Roadman for their generous assistance. 

3 The judges were Donald G. Paterson, Marvin D. 
Dunnette, George W. England, and Harry Roadman. 

As interests are judged in terms of scores for oc- 
cupational groups, it was decided to judge those oc- 
cupational groups which contain a single occupation 
along with the group of occupations with which it 
was most closely related. Relatedness was deter- 
mined by Strong’s published intercorrelations (3) 
and by judgmental factors. Thus, Group III which 
consists solely of the Production Manager scale was 
included with Group IV, the technical group; Group 
VII which is the Certified Public Accountant scale 
was included with Group VIII, the business detail 
group; and Group XI, the President of Manufac- 
turing Concern scale, was included with Group IX, 
the sales group. Because there is little relationship 
between Group VI, the Musician scale, and other 
occupational groups it was decided to exclude that 
group from the analyses. 

One other modification of the SVIB profile was 
made. Because the major study which used these 
data was primarily concerned with interests of men 
in the personnel and accountant-comptroller areas, 
it was decided to include the Personnel Director and 
Public Administrator scales as a separate group from 
the rest of the occupations in the social service group. 

The names and numbers of the occupational groups 
on the SVIB which were used in this study are as 
follows: I. Biological Sciences; II. Physical Sciences; 
III-IV. Technical; Va. Personnel; V. Social Service; 
VII-VIII. Business Detail; IX-XI. Business Con- 
tact; X. Verbal-Linguistic. 

In order to say that an interest pattern existed, 
at least two of the three judges had to indicate the 
presence of the pattern. The judging was consistent. 
There was 90% agreement between at least two out 
of three judges in the judging of primary patterns. 
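
The two-of-three rule can be stated compactly in code. This is a minimal sketch of the decision rule only; the function names and the toy judgments are hypothetical, and the report does not describe how the judges' ratings were recorded.

```python
def pattern_exists(judgments, needed=2):
    """Return True when at least `needed` judges indicated the pattern.

    judgments: one boolean per judge (three judges in the study).
    """
    return sum(1 for present in judgments if present) >= needed


def count_accepted_patterns(profiles):
    """Count the profiles on which the pattern is accepted under the rule."""
    return sum(1 for judgments in profiles if pattern_exists(judgments))


# Toy data: three judges rating four SVIB profiles for one interest pattern.
toy_profiles = [
    (True, True, True),     # unanimous: pattern accepted
    (True, True, False),    # two of three: pattern accepted
    (True, False, False),   # one of three: pattern not accepted
    (False, False, False),  # none: pattern not accepted
]
accepted = count_accepted_patterns(toy_profiles)
```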

Judges were also used in categorizing the civilian 
choice of occupation. The question on the personal 
history questionnaire which asked the officers to in- 
dicate their civilian choice of occupation was an 
open-ended question so there were many different 
types of responses. In order to make these data 
meaningful these occupations were categorized in 
terms of the occupational groups on the SVIB. 

Three qualified judges ⁴ determined the occupa- 
tional group on the SVIB to which a particular oc- 
cupation would most likely belong. “Belongingness” 
was in terms of vocational interests.⁵ Little diffi- 
culty was encountered when there was an occupa- 
tional scale on the SVIB for the occupation selected. 
However, difficulty was encountered with occupa- 
tions which seemingly represented a combination of 
interests and where the selected occupation was so 
vague and general as to preclude classification. Ex- 
amples of occupations which were difficult to judge 
and where there was little or no agreement were 
management consultant, managing a sports team, 
radio work, and aircraft manufacturing. 

4 These judges were Paterson, Donald P. Hoyt, 
and Thomas M. Magoon. 

5 This type of judgment has fairly high validity in 
spite of the known ambiguity in occupational titles 
as shown by Strong (4, pp. 13 and 91 f.). 

The criterion established for including a selected oc- 
cupation in an occupational group was agreement 
between at least two out of the three judges. This 
judging was consistent. There was 78% agreement 
between at least two out of the three judges in as- 
signing the occupations selected by the officers to 
the various occupational groups on the SVIB. 


Results 


In Table 1 the number and percentage of 
officers with judged stated interests in the 
different occupational groups are shown. Also 
included are the number and percentage who 
have no stated interests and the number and 
percentage who have stated interests which 
were excluded from occupational groups due 
to disagreements among the judges as to the 
most appropriate occupational group. 

The judges disagreed more in assigning the 
stated interests of the accountant-comptrollers 
than of the personnel officers to occupational 
groups on the SVIB. About 22% of the ac- 
countant-comptrollers had stated interests in 
this category as compared with about eight 
per cent of the personnel officers. The ac- 
countant-comptrollers more frequently indi- 
cated choices in the business detail area and 
in being self-employed and these were among 
the occupations about which the judges most 
frequently disagreed. 


Table 1 


Judged Stated Interests of Personnel Officers and 
Accountant-Comptrollers 


                              Personnel           Accountant- 
                              Officers            Comptrollers 

Occupational Group            N     Per Cent      N     Per Cent 

Biological Sciences                    1.0                 .9 
Physical Sciences                      2.3                 .4 
Technical                             12.2                7.4 
Personnel                             33.5                1.6 
Social Service                         4.9                 .8 
Business Detail                       13.1               50.2 
Business Contact                      11.9                7.0 
Verbal-Linguistic                      5.3                3.3 
No Stated Interests                    8.0 
Disagreement Between 
   the Judges                          7.8 

Total                       1,155    100.0       243 


Table 2 


Relationship Between Stated Interests and Measured 
Interests for the Personnel Officers 
(Chi square = 81.3, P < .001) 


                          Stated Interests      Stated Interests 
                          and                   but not 
                          Measured Interests    Measured Interests 

Occupational Group          N     Per Cent        N     Per Cent 

Biological Sciences*          2 
Physical Sciences             8 
Technical                    62 
Personnel                   258 
Social Service               23 
Business Detail              70 
Business Contact             82 
Verbal-Linguistic            10 

Total                       515    100.0                 100.0 


* The biological science group was not included in the chi 
square because of the small N. 

A plurality of both the personnel officers 
and the accountant-comptrollers had stated 
interests in occupations which were similar to 
their military occupations. About one-third 
of the personnel officers selected some aspect 
of personnel work and about one-half of the 
accountant-comptrollers selected the business 
detail area. 

Because the accountant-comptrollers tended 
to choose civilian occupations in just the busi- 
ness detail area, the frequencies in the other 
occupational groups are so small as to pre- 
clude the use of extensive statistical analyses. 
Thus, Table 2 shows the agreement and dis- 
agreement between stated interests and meas- 
ured interests just for the personnel officers. 
(Excluded from the table are 182 personnel 
officers who had no stated interest or whose 
stated interest could not be placed in any 
occupational group because of disagreements 
among the judges.) 

The chi-square value of 81.3 which was ob- 
tained for the data in Table 2 is highly sig- 
nificant. There was a significantly greater 
agreement between stated interests and meas- 
ured interests for some occupational groups 
than for others. The highest agreement is in 
the personnel area. About one-half of the 
personnel officers (50.1%) have stated and 
measured interests in the personnel area. 
There is a big drop in the number of per- 
sonnel officers who have agreement between 
stated and measured interests to the next 
highest groups which are the business con- 
tact, business detail, and technical groups. 
From these groups there is another drop to 
the social service group and then again to the 
verbal-linguistic, physical science, and bio- 
logical science groups. 

The highest percentage of personnel offi- 
cers whose stated interests do not agree with 
their measured interests is also in the per- 
sonnel area. Twenty-eight per cent of offi- 
cers who do not have agreement between 
stated and measured interests are in the per- 
sonnel area. This percentage is less than the 
percentage of officers who have both stated 
and measured interests in the personnel area. 
The fact that there is such a concentration of 
officers in the personnel area even when there 
is disagreement between stated and measured 
interests is not surprising in view of the fact 
that about one-third of all the personnel offi- 
cers selected personnel work as their civilian 
choice of occupation. Of this one-third about 
two-thirds also had measured interests in this 
area. 
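
The proportions quoted in this section can be checked with a few lines of arithmetic. The figures below are those read from Tables 1 and 2 (1155 personnel officers, 33.5 per cent with stated interests in the personnel area, 258 of the 515 agreeing officers in the personnel area); they are treated as assumptions wherever the printed tables are hard to read, and the variable names are illustrative.

```python
# Figures as read from Tables 1 and 2 of this report.
N_PERSONNEL_OFFICERS = 1155
PCT_STATED_PERSONNEL = 33.5   # per cent choosing personnel work
AGREE_PERSONNEL = 258         # stated and measured interests both in personnel
AGREE_TOTAL = 515             # all officers with stated-measured agreement

# Officers whose stated interest was personnel work (about one-third of all).
stated_personnel = round(N_PERSONNEL_OFFICERS * PCT_STATED_PERSONNEL / 100)

# Share of those officers who also had measured interests in the area.
share_with_measured = AGREE_PERSONNEL / stated_personnel

# Personnel area as a percentage of all officers showing agreement.
pct_of_agreeing = 100 * AGREE_PERSONNEL / AGREE_TOTAL
```

On these figures about 387 officers chose personnel work, roughly two-thirds of them also had measured interests there, and the personnel area accounts for 50.1 per cent of the officers showing agreement.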


Summary 


1. There was good agreement in judging 
primary interest patterns on the Strong Vo- 
cational Interest Blank for this sample of Air 
Force Officers in the personnel and account- 
ant-comptroller areas. In judging primary in- 
terest patterns the agreement between at least 
two of three judges was 90%. 

2. There was good agreement in assigning 
the civilian occupations selected by the Air 


Force Officers to the appropriate occupational 
groups on the Strong Vocational Interest 
Blank. The agreement between at least two 
of the three judges was 78%. 

3. The Air Force Officers in this study 
tended to select civilian occupations which 
were similar to their military occupations. 
About one-third of the personnel officers had 
stated interests in the personnel area and 
about one-half of the accountant-comptrollers 
had stated interests in the business detail 
area. 

4. In some occupational areas there was 
greater agreement between stated and meas- 
ured interests than in other occupational 
areas. For the personnel officers the best 
agreement was in the personnel area and for 
the accountant-comptrollers it was in the 
business detail area. However, no statistical 
analyses were made of these data for the ac- 
countant-comptrollers because of the extreme 
concentration of both stated and measured 
interests in this one area to the exclusion of 
all the other occupational areas. 
An Evaluation of Two Attitudinal Approaches to Delegation ¹ 


Allen R. Solem 
University of Maryland 


Current problem-solving procedures in busi- 
ness and industry indicate that there are 
many different points of view concerning the 
supervisory function of delegation (1, 2, 3, 4, 
5, 6, 7, 9, 10, 13, 14, 16, 17, 18, 19). Al- 
though there appear to be relatively few prob- 
lems in delegating the execution of decisions, 
there is a wide range of opinion concerning 
the degree to which it is advisable to share 
the decision-making function itself. Since not 
all decisions are properly subject to delega- 
tion the differences can be attributed in part 
at least to the types of problems involved. A 
more important factor, however, seems to be 
the superior’s frame of reference toward his 
job and his subordinates. Some superiors 
prefer to decide things on their own with little 
or no prior consultation; others tend to seek 
the advice of staff experts or peers before de- 
ciding and still others frequently use con- 
sultative procedures for obtaining the views 
of subordinates as a basis for their decisions. 
Despite these variations in procedure a com- 
mon factor in most approaches is that the 
superior must retain the authority to modify 
or reject ideas or decisions which do not meet 
with his approval. 

In contrast to this frame of reference is the 
attitude of placing final responsibility for cer- 
tain decisions and for the end results in one’s 
subordinates (8, 11, 13, 14). Such an atti- 
tude would imply that the superior reserves 
the right to decide what are the decisions he 
must make himself and what are those to be 
delegated. However, once the responsibility 
for making a decision or developing a solu- 
tion has been placed in one’s subordinates, 
the assumption is that the superior will ac- 
cept and support the action regardless of 
whether he personally agrees with it or not. 


1 This paper is a portion of a dissertation sub- 
mitted to the graduate faculty of the University of 
Michigan in partial fulfillment of the requirements 
for the degree of Doctor of Philosophy. The author 
is indebted to all of the members of his graduate 
committee for guidance and assistance throughout 
the study and especially so to Norman R. F. Maier, 
Chairman. 


This means that subordinates are held ac- 
countable for results, not for developing solu- 
tions designed to obtain the approval of the 
superior. 

The difference between these two views of 
delegation raises a number of relevant ques- 
tions to problem-solving in management, in- 
cluding: (a) What influence, if any, does the 
delegation approach that is used have on 
solution quality? (b) Is there any difference 
between the two approaches as to the accept- 
ance of the decisions by those who must carry 
them out? (c) What implications are there 
in the two procedures for the development of 
problem-solving and managerial abilities of 
subordinates? (d) To what extent do the 
differences between the two approaches re- 
flect attitude differences as contrasted to such 
attributes as knowledge and skill? (e) What 
guides, if any, do the differences between 
these views of delegation indicate for manage- 
ment training and research? 


Method 


Subjects. The Ss were 456 supervisors attending a 
foremen’s conference. They represented several lev- 
els of management and many different industries. 

Role-playing problems. Two different manage- 
ment problems were used. Both problems have ap- 
peared in other previous publications (12, 13, 14) 
and are merely summarized here. One problem (re- 
ferred to later as the New Truck Problem) concerns 
the allocation of a new truck among the five mem- 
bers of a crew of repairmen, all of whom want the 
truck. This creates an attitude conflict among the 
members which must be resolved before a solution 
can be reached. The other problem (designated as 
the Change of Work Procedure Problem) involves 
a crew of three men on a routine assembly opera- 
tion who rotate positions periodically in order to 
prevent boredom. Meanwhile a methods study has 
revealed that if each man were to remain on the 
position for which he is best suited there will be a 
considerable saving in time per unit. However, the 
men fear boredom and mistrust the management mo- 
tives underlying the alteration in job procedure. 
Thus the problem typifies the forces involved in re- 
sistance to change. 

Role-playing procedure. Multiple Role Playing 
(15) was used as the experimental procedure. Two 
separate experimental sessions were held (one for 
each problem) and the Ss in the two sessions were 
different so that practice effects were minimized. 


Table 1 

Solutions to Problems Developed by Supervisors Under Conditions of Limited Delegation (LD) 
and Full Delegation (FD) 
(N = 456) 

New Truck Problem 

                                  (2)        (3)              (4) 
                                             Assignment of    Giving Varying Majorities 
                                  No. of     New Truck to     Different Trucks 
                                  Groups     Senior Man       (5, 4, & 3) 

LD groups                           19          10.5             21.1 
FD groups                           27          48.2             55.5 
χ² (computed from frequencies)                  7.18             5.48 
Level of significance                        (.01-.001)       (.01-.02) 

Change of Work Procedure Problem 

                                  (5)        (6)              (7) 
                                  No. of     Problem          Estimating 
                                  Groups     Persons          Production Increase 

LD groups                           20          20.0 
FD groups                           25          10.7 
χ² (computed from frequencies)                  2.30             13.42 
Level of significance                        (.10-.20)        (.001-.0001) 

Both Problems Combined 

                                  (8)          (9)              (10) 
                                  Satisfied    Dissatisfied     Conditional 
                                  Leaders      Group Members    Solutions 

LD groups                           84.6         19.7             23.1 
FD groups                           98.1          8.1             40.4 
χ² (computed from frequencies)      6.00         10.08            3.02 
Level of significance             (.01-.02)    (.01-.001)       (.05-.10) 


Problem 1 


New truck problem. Following a lecture on the 
subject of attitudes, the Ss were formed into labora- 
tory groups of approximately 40 individuals. These 
groups then met in separate rooms under the leader- 
ship of trained experimenters. When each group 
had assembled the Ss were informed that they were 
to participate in a discussion of a management prob- 
lem involving a foreman and his crew of 5 repair- 
men, and since they were going to be these men in 
the discussion, the Ss were asked to form into groups 
of six. A brief discussion was held as to the nature 
of role-playing procedures. Following this was a 
presentation of certain essential background infor- 
mation on the problem. The experimenter then gave 
one person at random in each group of six a set of 
roles (this person was thus designated as the leader 
of his group) with the instruction to retain the 
leader role (including the problem) and distribute 
the remaining five roles among the members in his 
crew. The sets of roles were the same for all crews; 
however, the individual member roles were different. 
Half of the leaders were given a written attitude in- 
struction toward deciding what would be the fairest 
solution to the problem of allocating the new truck 
and then discussing the solution with their crews. 
The remaining half of the leaders were given a writ- 
ten attitude instruction toward presenting the prob- 
lem to the crew for discussion and accepting what- 
ever solution was developed. After 25 minutes of 
interaction, all discussions were ended and the ex- 
perimenter then proceeded with the collection of the 
data and a general discussion of the results. 

Data collection procedure. Data on the following 
aspects of the solutions were obtained from all 
groups, one group at a time. 


1. Who got the new truck? 

2. What disposition was made of that person’s old 
truck [until all trucks had been accounted for]? 

3. Are there any other aspects of the solution not 
already covered? 

4. Is the leader satisfied or dissatisfied? 

5. Which crew members are dissatisfied? 


Problem 2 


Change of work procedure. The sequence of steps 
in the experimental procedure was the same as for 
the first problem. However, the experimental period 
was preceded with a lecture on frustration princi- 
ples. In the laboratory session itself, the Ss were 
asked to form groups of four persons, as called for 
by the problem. Also the questions used in the col- 
lection of the data were different and consisted of 
the following: 


1. What is the solution in your group [asked of 
the leader]? 

2. Which of your crew members, if any, showed 
stubborn, hostile, or uncooperative reactions so 
as to create a problem in the discussion? 

3. What will happen to production if the solution 
you have settled on is put into effect [asked of 
all participants with separate tabulations for 
“increase,” “decrease,” and “stay about the 
same”]? 

4. Are you [the leader] satisfied or dissatisfied 
with the solution? 

5. Which crew members [if any] are dissatisfied? 


Results 


The results are shown in Table 1. The 
data which are unique to each problem are 
shown in Columns 2 through 7 and those 
which are common to both problems have 
been combined in Columns 8, 9, and 10. 

Under limited delegation (LD) the senior 
man had about one chance in ten of getting 
the new truck. However, under full delega- 
tion (FD) the new truck was assigned to the 
senior man in nearly half of the solutions. 
Thus it seems that the values in seniority as 
a basis for assigning the new truck tended to 
mean different things under the two delega- 
tion conditions. 
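
The chi square for this comparison can be recomputed from a fourfold table. The sketch below is illustrative: the report gives percentages rather than raw frequencies, so the counts used here (2 of 19 LD groups and 13 of 27 FD groups assigning the truck to the senior man) are back-calculated from the reported 10.5 and 48.2 per cent and should be read as assumptions. The function name is hypothetical, and no continuity correction is applied.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi square (1 df, no continuity correction) for the table

        [[a, b],
         [c, d]]
    """
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Rows: LD groups, FD groups. Columns: senior man got the truck, did not.
# Counts are inferred from the reported percentages, not taken from the text.
ld_senior, ld_other = 2, 17    # 2 of 19 groups, about 10.5 per cent
fd_senior, fd_other = 13, 14   # 13 of 27 groups, about 48 per cent

chi_sq_senior = chi_square_2x2(ld_senior, ld_other, fd_senior, fd_other)
```

With these inferred counts the statistic comes to about 7.18, which falls between the .01 and .001 levels for one degree of freedom.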

When one crew member receives the new 
truck, then his previous one must be assigned 
to a different crew member or be disposed of. 
This means that several exchanges of vehicles 
may occur. Since such exchanges are volun- 
tary they will occur when both parties feel 
they will gain in some way. The number of 
exchanges therefore may be taken as a meas- 
ure of solution quality. Viewed in this light, 
the fact that three or more of the five crew 
members received different trucks about 2½ 
times as often under FD as under LD sug- 
gests that the superiors were less likely to see 
the possibility of rewarding several individu- 
als than were the subordinates. A similar 
tendency is indicated with respect to the con- 
ditional solutions in Column 10. These are 
solutions which contain unique features or 
extras designed to satisfy particular needs of 
subordinates, and such solutions tended to 
occur more frequently under the FD condi- 
tion. Further, an inspection of the raw data 
reveals an interesting qualitative difference 
in that the conditions developed under LD 
tend to be in the nature of concessions ex- 
acted from the superior, and under FD the 
conditions are in the nature of constructive 
improvements to the solution. 

In Column 7 a somewhat different measure 
of solution quality is indicated for the Change 
of Work Procedure problem. These data rep- 
resent the views of superiors and subordinates 
as to whether the adoption of the new solu- 
tion will result in an increase in production 
vs. a decrease or no change. While it seems 
probable that feeling judgments of acceptance 
as well as intellectual evaluations of solution 
quality are both represented, the difference in 
proportion of those predicting a production 
increase to occur is significantly in favor of 
the FD condition. 

In the Change of Work Procedure problem 
all superiors were provided with a ready-made 
solution to the problem for presentation to 
the subordinates. Given this limitation on 
the freedom of all superiors it is of some in- 
terest to note the greater tendency for sub- 
ordinates under LD to be hostile, obstructive, 
or otherwise create a problem for the leader 


than was true under FD. In other words the 
superiors using the LD procedure apparently 
had a less pleasant experience in conducting 
their discussions than was true of those using 
the FD approach, yet were unable to improve 
things appreciably once the discussions got 
under way. Further evidence of a related na- 
ture is indicated in Columns 8 and 9 which 
show that there were significantly fewer 
satisfied leaders under the LD condition and 
a significantly greater proportion of dissatis- 
fied subordinates than under FD. 


Discussion 


The results indicate that a superior who re- 
serves to himself the authority to make final 
decisions may not always expect as satisfac- 
tory results as when full responsibility for 
solving certain problems is delegated to one’s 
subordinates. Regardless of how perceptive 
and fair-minded the superior may try to be, 
it appears that he may often tend to misjudge 
the importance of group values and to over- 
look various opportunities for rewarding his 
subordinate group members. Further, the in- 
dications are that the LD approach as com- 
pared to FD is more likely to generate hos- 
tility and dissatisfaction among subordinates 
and result in a less satisfactory problem-solv- 
ing experience for the superior himself. 

In part at least, the differences in results 
appear to arise from the fact that the LD 
procedure causes the superior to take an ini- 
tial position as to what is the proper solution 
so that, in reality, he is presenting a solution 
to the group, not a problem. To the degree 
that the solution is at variance with the needs 
and ideas of the subordinates, it becomes a 
focal point for the expression of dissatisfac- 
tion and criticism. A problem on the other 
hand tends to stimulate ideas and construc- 
tive thinking towards various alternative solu- 
tions. Hence, by presenting his own views as 
a solution the superior tends to place himself 
in the position of defending a given set of 
views rather than aiding his subordinates in 
the development of new ideas for solving the 
problem. Although logic may indicate the 
desirability of altering a decision in the light
of important new information, this is not 
likely to occur when a superior feels that such 
action may be interpreted as a sign of weak- 
ness or indecision. 

From this it appears that an important
contribution of the full delegation attitude of 
the superior is that it influences subordinates 
toward constructive solution of a problem on 
its own merits. In so doing, it helps to avoid 
any tendencies toward merely giving lip serv- 
ice to a superior’s solution, of arguing with 
him, or of doing as directed with reduced 
motivation. 

The results from this one experiment yield 
only partial answers to some of the questions 
raised earlier in this report and even these 
answers must be viewed with reservations. 
For one thing the previous supervisory ex- 
periences of the subjects may have caused 
them to react differently to the experimen- 
tal situation from other nonsupervisory em- 
ployees in industry. In addition, the delega- 
tion conditions tested may not be representa- 
tive of more than a very limited segment of 
managerial situations. Further studies for the 
exploration of these and other related issues 
are now in progress. 


Summary 
This experiment was concerned with the 
study of attitudinal influences on the dele- 
gation process. Management personnel were 


formed into groups of four and six members 
for the purpose of solving two different but 
typical industrial problems. In each group 
one member was selected at random to role 
play the part of the superior and the other 
members took the part of his subordinates.
In half of the groups the superior was given 
an attitude cue toward arriving at a decision 
and then discussing things with his subordi- 
nates, thus limiting the delegation of prob- 
lem solving. The remaining superiors were 
given an attitude cue toward presenting the 
problem to their subordinates for their solu- 
tion, and accepting whatever decision was 
made, this being termed full delegation. In 
terms of solution quality, acceptance and 
satisfaction of superiors and subordinates, 
the full delegation procedure consistently 
yielded the more satisfactory results; five of 
seven differences being significant beyond the 
2% level of confidence. The results are in- 
terpreted to mean that attitudes of super- 
vision toward the delegation process may be 
an important factor in the solution of certain 
management problems. 


The Effects of Sound Films on Opinions About Mental
Illness in Community Discussion Groups 1


Elliott McGinnies, Robert Lana, and Clagett Smith 2

Research concerning the effects of mass
communications on attitudes and opinions has 
generated a rather perplexing set of results. 
Despite common belief that the mass media 
exert a profound influence upon the manners 
and morals of recipients, evidence to this ef- 
fect is still fragmentary and controversial. 
Early investigators in the area of motion pic- 
ture films, for example, have reported not only 
immediate but persisting attitude changes fol- 
lowing exposure to a single film. The findings 
of Peterson and Thurstone (9) who used films 
to induce changes in the attitudes of children 
on such topics as nationality, crime, and war 
are typical of results reported in this area. 
These investigators further reported that a se- 
ries of films was sometimes successful in in- 
ducing attitude change where a single film 
had failed. Hoban and van Ormer (4), on 
the other hand, have examined studies done 
with Army training films and educational films 
and have concluded that a single communica- 
tion produces only a temporary effect upon 
attitudes, if any. Fearing (2), who expresses 
scepticism with respect to the usefulness of 
films for this purpose, states that “. . . re- 
search, conducted as carefully as we know 
how to conduct it, reveals that the effects of 
these media—films and radio, especially films 
—on human attitudes and behavior is unex- 
pectedly slight.” Factual knowledge, how- 
ever, as many studies have shown, can effec- 
tively be imparted by the use of instructional 
films (4). 

Evidence has also been presented to show 
that group discussion facilitates learning, atti- 
tude change, and readiness to make a decision 


1 This research was supported by a special grant 
from the National Institute of Mental Health, Na- 
tional Institutes of Health, United States Public 
Health Service. Richard Bell and Hyman Goldstein 
of NIMH took an active part in the initial planning 
stages of the project. Joseph Bobbitt of NIMH has 
been helpful at all stages of the study. 

2 Now in the Graduate Department of Social Psy- 
chology at the University of Michigan. 
with respect to communicated material. Hovland,
Janis, and Kelley (5) suggest the possibility
that acceptance as well as learning
may be affected by eliciting overt verbali- 
zations about a persuasive communication. 
They state, “When an individual verbalizes an 
idea to others he becomes more inclined to 
accept it himself.” Bennett (1) has con- 
cluded that the function of a group discussion 
is to “facilitate decision and/or the perception
of consensus. . . .” Somewhat more
tangential evidence for the efficacy of group 
discussion under these conditions comes from 
a study by Timmons (11), who found that 
individuals allowed to discuss a problem in a 
small group obtained solutions superior to 
those of persons who did not discuss the ma- 
terial. 

In light of the possibility that motion pic- 
ture films may under some circumstances be 
effective persuasive devices, the present study 
was undertaken for the purpose of evaluating 
the effects of one or more mental health films 
in adult community groups. A questionnaire 
covering opinions and beliefs with respect to 
mental illness was given to groups before and 
after exposure to a single film or to a series 
of films. In order to determine whether ac- 
tive participation would facilitate any effects 
of the films, discussions were held in half of 
the groups following film presentations and 
prior to the second administration of the opin- 
ion inventory. 

The hypotheses examined in these experi- 
ments were as follows: 

1. A single mental health film presented to 
an audience without discussion will signifi- 
cantly influence opinions about mental ill- 
ness. 

2. Group discussion of a single film will 
facilitate opinion change as compared with 
the nondiscussion situation. 

3. A series of three mental health films 
presented without discussion will result in 
greater opinion change than that generated
by a single film under the same conditions. 

4. Group discussion following each of a se- 
ries of three films will bring about greater 
opinion change than under nondiscussion con- 
ditions. 


Experiment One 
Method 


Subjects. Six small groups totaling 76 individuals 
were formed from larger P. T. A. and child-study 
groups in Prince Georges County, Maryland. Four 
of these groups, varying in size from 11 to 18 mem- 
bers, were shown a series of mental health films. 
Two additional groups containing nine members each 
served as controls. In general, the group members
were drawn from the upper middle class segment of 
the population and were fairly homogeneous with 
respect to age and education. Several of the groups 
contained both men and women, while the remainder 
were composed entirely of women. Since there is no 
evidence to indicate that sex is systematically related 
to susceptibility to a persuasive communication, no 
attempt was made to balance this factor over all of 
the groups. The mean age of the Ss was 38.8 years, 
they enjoyed on the average 2.8 years of college edu- 
cation, their mean family income was $7,800, and 
63% identified their occupation as “housewife.” On 
a nine-point self-rating scale of “familiarity with 
mental health problems and concepts” they assigned 
themselves a mean scale value of 4.5. 

Materials. The films selected for study were The 
Feeling of Rejection, The Feeling of Hostility, and 
Breakdown. The first two deal with the etiology 
of personality disturbance, while the third is con- 
cerned with institutional treatment of psychotic dis- 
orders. All three films were produced by the Na- 
tional Film Board of Canada and are widely used in 
mental health education programs. 

In order to construct an instrument that would 
enable us to assess the opinions and beliefs of the Ss 
with respect to various aspects of mental illness, we
first assembled a pool of 112 relevant statements de- 
rived from several sources (7, 8, 10, 12). The state- 
ments dealt in general with the etiology, perception 
and prognosis, treatment, and post-treatment percep- 
tion of mental illness. A questionnaire consisting of 
these items was initially administered to 157 students 
in psychology and sociology at the University of 
Maryland. Item analysis employing Flanagan’s ap- 
proximation method (3) justified reduction of the 
questionnaire to 72 statements. Repetition of this 
procedure with adult groups yielded a final list of 
47 items. Test-retest reliability of this form on sam- 
ples of university students and adult P. T. A. groups 
was about .86. A split-half reliability coefficient of 
.83 was obtained from an independent sample using 
the Spearman-Brown formula. Responses to the 
statements were made on a five-point rating scale 
ranging from “strongly agree” to “strongly disagree,” 
and the method of summated ratings was used to 


obtain individual scores. Following are examples of 
the types of items included on the form: 

1. It is better not to discuss a mental illness as one 
would a physical illness. 

2. Few of the people who seek psychiatric help 
need the treatment. 

3. An employer should avoid hiring someone who 
has been in a mental hospital. 

4. Nervous breakdowns are due to overwork. 
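
The split-half coefficient mentioned above in connection with the Spearman-Brown formula can be illustrated with a short calculation. The following is a minimal sketch in Python, not the authors' computation: the half-test correlation is not reported in the text, so the value used here is back-computed from the reported .83 purely for illustration, and the function names are hypothetical.

```python
def spearman_brown(r_part: float, factor: float = 2.0) -> float:
    """Step up a part-test correlation to a test `factor` times as long."""
    return factor * r_part / (1.0 + (factor - 1.0) * r_part)


def implied_half_correlation(r_full: float) -> float:
    """Invert the two-half case: the half-test r that steps up to r_full."""
    return r_full / (2.0 - r_full)


reported = 0.83  # split-half coefficient reported in the text

# Back-computed for illustration only; this value is not given in the source.
r_half = implied_half_correlation(reported)

print(f"implied half-test r    = {r_half:.3f}")
print(f"stepped-up reliability = {spearman_brown(r_half):.2f}")
```

On these assumptions, a correlation of roughly .71 between the two halves of the form would correspond to the reported full-length coefficient.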

Scoring of the items was based upon the responses 
of 12 staff members and graduate trainees at the 
University of Maryland Counseling Center. A high 
score was obtained by S if his responses were in the 
same direction as those expressed by these “experts.” 
A low score indicated disagreement with this pro- 
fessional opinion. As nearly as could be determined, 
the opinions and beliefs of our panel of professionals 
coincided with the general points made in the films. 
It should be noted that we do not refer to the meas- 
uring instrument as an “attitude scale,” since we 
have little evidence that the assumptions underlying 
a true scale have been met. Experience with the 
questionnaire, however, has indicated that it does 
measure reliably certain beliefs and opinions that 
people hold with respect to various aspects of men- 
tal illness. We shall refer to the questionnaire as the 
“Mental Health Opinion Inventory.” 

The range of possible scores on the inventory was 
47-235. If S checked the middle, or indeterminate, 
position for each statement he would achieve a total 
score of 141, indicating neither agreement nor dis- 
agreement with professional opinion. A total score 
on all items of 188 would indicate agreement but not 
strong agreement with expert opinion. Since the 
group mean pretest scores ranged from 163.5 to 
184.9, it is apparent that our Ss were initially some- 
what predisposed toward professional judgment on 
the scale items. Our experience, however, has been 
that in groups of the types studied it is exceedingly 
difficult to discover opinions about mental health 
issues that are markedly naive or inaccurate. The 
individuals who would be most likely to score low 
on questionnaires of this type are precisely those 
persons who do not participate in community ac- 
tivities designed for educational purposes. In the 
present instance, the goal of the communicator is 
limited to overcoming certain misconceptions that 
exist among groups of interested persons who other- 
wise are fairly well-informed about mental health 
problems. 
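
The score range quoted above follows directly from the method of summated ratings applied to 47 five-point items. The following is a minimal sketch in Python; the numeric coding of the responses (1 to 5) and the reverse-keying of items on which the experts disagreed are assumptions made for illustration, since the text states only that a high score reflects agreement with the panel.

```python
from typing import Sequence

N_ITEMS = 47       # items on the final form (from the text)
SCALE_POINTS = 5   # "strongly agree" ... "strongly disagree"


def item_score(response: int, experts_agree: bool) -> int:
    """Score one item so that 5 is the strongest match with expert opinion.

    `response` is coded 1 (strongly disagree) to 5 (strongly agree);
    this coding and the reverse-keying are illustrative assumptions.
    """
    if not 1 <= response <= SCALE_POINTS:
        raise ValueError("response must lie between 1 and 5")
    return response if experts_agree else (SCALE_POINTS + 1) - response


def total_score(responses: Sequence[int], key: Sequence[bool]) -> int:
    """Summated rating: the sum of the keyed item scores."""
    if len(responses) != len(key):
        raise ValueError("responses and key must have the same length")
    return sum(item_score(r, k) for r, k in zip(responses, key))


# Hypothetical key in which the experts agree with every statement;
# an actual key would mix agree and disagree items.
key = [True] * N_ITEMS

lowest = total_score([1] * N_ITEMS, key)
neutral = total_score([3] * N_ITEMS, key)
agree = total_score([4] * N_ITEMS, key)
highest = total_score([5] * N_ITEMS, key)

print(lowest, neutral, agree, highest)  # 47 141 188 235
```

The four totals reproduce the figures given in the text: the limits of 47 and 235, the indeterminate score of 141, and the score of 188 for uniform but not strong agreement.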

A biographical questionnaire was also given to all 
Ss in the experiments so that we could determine 
whether the groups were comparable with respect to 
a number of socioeconomic criteria. As indicated 
earlier, no serious discrepancies appeared among the 
several groups in this respect. 

Procedure. Two of the groups viewed the three 
films at bi-weekly intervals. Each film presentation 
was followed by a half-hour discussion of the film 
or of related topics. The same discussion leader, a 
professional psychologist, met all of the groups in 
order to control any effects that the leader’s personality
might have upon the discussion process. In order
to permit the fullest expression of individual
opinion by the group members, a permissive or non- 
directive approach was taken by the discussion 
leader. At the outset of the meetings, the groups 
were informed that the discussions would be re- 
corded. No further mention was made of this, and 
most of the Ss later appeared oblivious to the fact 
that a tape-recording was being made. The record- 
ing apparatus was always operated from the rear of 
the room, and the microphones were strategically lo- 
cated before the Ss assembled. A fuller description 
of this procedure is reported elsewhere (6). The 
Mental Health Opinion Inventory was administered 
before the first film was shown and at the conclusion 
of the third discussion. 

The same procedure was followed for two addi- 
tional groups except that discussion of the films was 
omitted from the meetings. In order to control for 
expectancy of discussion and the possible effects of 
this upon perception of the films, these groups were 
told that they would discuss all three films at the 
conclusion of the final screening. They were allowed 
to do this only after they had completed the inven- 
tory for the second time, so that the discussion could 
have no effect upon the measurement of opinion 
change. 

Two control groups simply responded twice to the 
inventory, with a four-week interval between ad- 
ministrations. 


Results 


It had been predicted that opinions and beliefs
about mental illness, as reflected in responses
to the Mental Health Opinion Inventory,
would be altered as a result of exposure
to the film series. It was also hypothesized 
that opinion change would be greater in those 
groups that had discussed the films as com- 
pared with the groups that were given no op- 
portunity for discussion. Discrepancies be- 
tween scores on the pre- and posttreatment 
administrations of the inventory were taken 
as measures of opinion change. Table 1 shows 
the mean scores for all of the groups before 
and after experimental treatment. Positive 
difference scores indicate movement in the di- 
rection of professional opinion on the ques- 
tionnaire. 

Before testing for the effects of treatment 
upon opinions and beliefs about mental ill- 
ness, the difference scores for the two groups 
within each treatment were examined for het- 
erogeneity of variance and for differences be- 
tween means. In all cases, the two groups 
representing each treatment did not differ sig- 
nificantly in either of these respects. The 
within-treatment groups, therefore, were com- 
bined in assessing over-all effects of experi- 
mental procedure upon opinion change. These 
treatment means are shown in Table 1. It is 
apparent that both the film-alone and film- 
discussion groups were influenced by the ex- 
perimental conditions, while the control groups 


Table 1

Pretest, Posttest, and Difference Scores for All Groups

                             Pretest         Posttest      Group Mean   Treatment Mean
Group                  N     M      SD       M      SD     Difference   Difference
Film-discussion I     16   163.5   18.6    183.1   16.0      19.6
Film-discussion II    18   184.9           198.0             13.1
Film-alone I          13   174.6           186.5
Film-alone II         11   173.1           191.9
Control I              9
Control II             9


Table 2

Analysis of Variance on Adjusted Posttest Scores
(Covariance Method)

Source      Sum of Squares   Mean Square     F       P
Between         3,562.4        1781.2      17.10   <.001
Within          7,604.0         104
Total          11,166.4


made approximately the same scores on both 
administrations of the inventory. 

In order to control for differences among 
the groups in initial opinion, analysis of co- 
variance of the mean opinion change scores 
was employed. The results of this analysis 
are summarized in Table 2. Treatment ef- 
fects, as indicated in the table, were signifi- 
cant beyond the .001 level. To determine the 
specific sources of this between-treatment variance,
three analysis of covariance t tests were
performed. This test involves weighting the
denominators of the conventional t ratios by
the coefficient of alienation of the entire sam- 
ple. Of the three possible comparisons, two 
were significant. Both the film-discussion and 
the film-alone groups differed significantly at 
the .001 level from the control groups. The 
third comparison, that between the film-dis- 
cussion and the film-alone conditions, was not 
significant at the .05 level. Participation 
through group discussion of the films did not 
enhance the effects of the films so far as 
changes in opinion scores were concerned. 
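
The entries of Table 2 can be cross-checked with a little arithmetic. The following is a minimal sketch in Python, assuming two degrees of freedom between the three treatments (film-discussion, film-alone, control); that figure is inferred from the design rather than read from the table.

```python
# Values as printed in Table 2.
ss_between = 3562.4
ss_within = 7604.0
ss_total = 11166.4
f_reported = 17.10

# Assumed from the design: three treatments, hence two degrees of freedom.
n_treatments = 3
df_between = n_treatments - 1

# Mean square between; this agrees with the 1781.2 printed in the table.
ms_between = ss_between / df_between

# Mean square within implied by the printed F ratio (the table shows about 104).
ms_within_implied = ms_between / f_reported

print(f"SS between + SS within = {ss_between + ss_within:.1f} (total {ss_total:.1f})")
print(f"MS between             = {ms_between:.1f}")
print(f"implied MS within      = {ms_within_implied:.1f}")
```

The sums of squares add to the printed total, and the ratio of the two mean squares is consistent with the F of 17.10 on which the .001 significance statement rests.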

In order to examine the possibility that a 
“consensus” effect might have been generated 
in the discussion groups, even though this was 
not reflected in greater opinion change, the 
variances of pre- and posttreatment inventory 
scores were compared for the various experi- 
mental conditions. In no instances did the 
variances differ between groups, either before 
or after treatment, indicating that converg- 
ence of opinions following exposure to the film 
series was no greater under discussion than 
under nondiscussion conditions. 

The possibility now remained that one of 
the films was responsible for the obtained 
opinion changes. It had been determined, for 
example, that the groups tended to prefer the 


film Breakdown the most and The Feel- 
ing of Rejection the least. It was also con- 
ceivable that discussion had failed to sum- 
mate with the effects of the films because the 
accumulated impact of the film series pro- 
duced a maximum effect by itself. A further 
study, therefore, was designed to determine 
(a) whether one of the three films was re- 
sponsible for most of the measured effects 
upon opinion, and (b) whether group discussion
would generate greater opinion change
as compared with nondiscussion conditions 
when a single film rather than a series of films 
was used. 


Experiment Two 
Method 


Subjects. A total of 64 individuals forming six 
groups participated in this study. The groups varied 
in size from 8 to 13 members. Since they were re- 
cruited from the same types of P. T. A. and child- 
study groups as the Ss in Experiment One, they 
were similar in all important respects to the partici- 
pants in that study. The mean age of the Ss was 
38.3, they had attained an average of 2.5 years of 
college education, their mean family income was 
$7,280, and 76% identified themselves as “house- 
wives.” On the nine-point self-rating scale of fa- 
miliarity with mental health problems they achieved 
a mean scale value of 4.1. 

Materials. The films, biographical questionnaire, 
and Mental Health Opinion Inventory were the same 
as those used in the first experiment. In this first 
instance, however, the films were shown singly rather 
than as a series. 

Procedure. In order to make the interval be- 
tween administrations of the opinion inventory com- 
parable to that of the previous investigation, the first 
testing was done at a regularly scheduled meeting of 
each group. The experimental sessions were held 
one month later, at the conclusion of which the posttreatment
measures were taken. Each of the three
films previously described was shown to a different 
group, the members of which engaged in discussion 
of the film with the same leader who had served in 
Experiment One. At the conclusion of the discus- 
sions they filled out the opinion inventory to which 
they had first responded four weeks earlier. 

The remaining three groups each viewed one of 
the films and were posttested on the questionnaire 
without discussion. Since the previous study had 
shown that no opinion changes could be expected in 
groups which were not experimentally treated, it was 
considered unnecessary to include this type of con- 
trol in the design. The principal concern of this 
study was to determine the effectiveness of a single 
film with and without discussion upon changes in 
opinions and beliefs about mental illness. Since the 
same three films were used as in the first experiment 


Table 3

Pretest, Posttest, and Difference Scores for All Groups

                              Pretest     Posttest       Mean
Group                   N     M           M      SD      Difference
Film Aᵃ-discussion     13
Film A-alone            8
Film Bᵇ-discussion     13
Film B-alone            8
Film Cᶜ-discussion     11
Film C-alone           11

[Cell entries are largely illegible in the source; the surviving values are 171.4, 18.9, 6.1, and 22.2.]

ᵃ Film A = The Feeling of Hostility.
ᵇ Film B = Breakdown.
ᶜ Film C = The Feeling of Rejection.


and were randomly assigned to the participating 
groups, it was expected that any differences in their 
relative effectiveness would appear in the pre- and 
posttreatment opinion measures. Any interaction be- 
tween number of films shown and opportunity for 
discussion by the audience members would be re- 
vealed in differences between the film-alone and film- 
discussion groups, no such differences having been 
obtained with a three-film series. 


Results 


The initial and posttreatment scores of the 
various groups, together with the mean dif- 
ference scores, are shown in Table 3. It will 
be noted from the table that the range of 
mean group pretest scores is somewhat less 
than in Experiment One, ranging in this instance
from 155.0 to 169.9. The mean score for 
these Ss is also somewhat lower than for those 
participating in the first study. Since these 
individuals were further from the ceiling of 
the measuring instrument than the Ss in the 
prior experiment, they might have been ex- 
pected to change more readily under persua- 
sive influence, which in this case consisted of 
a single film rather than a series of three 
films. 

In order to discover whether any over-all 
differences existed among the various groups, 


an analysis of covariance was done on the 
opinion-change scores. The F test, as indi- 
cated in Table 4, was not significant at the 
.05 level, and the null hypothesis of no differences
among the six groups is accepted.
To determine whether any of the mean dif- 
ference scores from pre- to posttreatment 
were significantly greater than zero, six analysis
of covariance t tests were performed. Only
one of the groups, the film-alone group view- 
ing Breakdown, showed a significant change 
in mean opinion score following treatment. It 
would be unjustified on the basis of this one 
significant finding to conclude that a single 
mental health film, with or without discus- 
sion, is capable of influencing opinions about 


Table 4

Analysis of Variance on Adjusted Posttest Scores
(Covariance Method)

Source      Sum of Squares   Mean Square    F    P
Between          992.4          198.5
Within        10,774.3          189.0
Total         11,766.7

mental illness. Noteworthy, however, is the
fact that five of the six groups showed opin- 
ion-change scores in the direction supported 
by the films, even though just one of these 
changes is significant. The nonsignificant F
test for between-group treatments indicates 
that the discussions added nothing to the 
measured effectiveness of the films in the 
present study. 
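
The mean squares of Table 4 can be related to the design of this experiment. The following is a minimal sketch in Python, assuming the usual one-way covariance layout with six groups, 64 Ss, and the pretest as a single covariate. The degrees of freedom and the resulting F ratio are inferred here for illustration and are not quoted from the table.

```python
# Values legible in Table 4.
ss_between = 992.4
ss_within = 10774.3
ss_total = 11766.7
ms_between = 198.5
ms_within = 189.0

# Assumed degrees of freedom (not legible in the table itself).
n_subjects = 64     # total Ss in Experiment Two
n_groups = 6        # three films, each with and without discussion
n_covariates = 1    # the pretest score
df_between = n_groups - 1
df_within = n_subjects - n_groups - n_covariates

# Under these assumptions SS / df should reproduce the printed mean squares.
print(f"SS between / {df_between}  = {ss_between / df_between:.1f}")
print(f"SS within  / {df_within} = {ss_within / df_within:.1f}")

# Ratio of the printed mean squares.
f_ratio = ms_between / ms_within
print(f"F = {f_ratio:.2f}")
```

An F ratio this close to unity agrees with the statement that the over-all test fell short of the .05 level.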


Discussion 


Considering the results of both experiments, 
it is apparent that only one of our original 
hypotheses has been confirmed, namely, that 
a series of three mental health films presented 
without discussion will result in greater opin- 
ion change than that generated by a single 
film under the same conditions. In fact, only 
one of three films shown singly was effec- 
tive in modifying scores on a questionnaire 
dealing with opinions about mental illness, 
and this was under nondiscussion conditions. 
While a series of three films proved useful in 
bringing opinions about mental illness more 
in line with professional thinking, discussions 
following each film in the series failed to aug- 
ment this effect. 

What do these findings imply for the use of 
mental health films in educational programs 
as well as for questions concerning the sus- 
ceptibility of opinions and attitudes in gen- 
eral to influence through motion pictures? 
For one thing, the findings in these two studies 
suggest strongly that a series of films dealing 
with a common topic may be effective modi- 
fiers of opinions where a single film is likely 
to produce no measurable result. It should 
be noted, of course, that none of the films 
used here had a running time of more than 
40 minutes. The rather dramatic results in 
attitude change obtained by Peterson and 
Thurstone (9) may be attributable in part to 
the fact that they employed full-length Holly- 
wood productions with considerable emotional 
impact. Educational-indoctrination films, such 
as those used by the armed forces and those 
in the present study, are generally limited in 
both scope and dramatic appeal. It is not 
surprising, then, that the effects of these more 
conservative productions upon attitudes and 
beliefs are often difficult to detect, and that a 


single presentation is apt to produce no meas- 
urable changes. 

The failure of active participation to fa- 
cilitate opinion change in the two investiga- 
tions reported here may be due to several 
factors. First, the discussions were per- 
mitted to develop along any lines suggested 
by the group members. In general, the discus- 
sions centered about the characters and plots 
of the film rather than about mental health 
problems in general, so that generalization 
from the films to the more discursive items 
on the questionnaire was probably minimized. 
A second and more probable explanation for 
the failure of the discussions to implement 
the impact of the films is that the films by 
themselves induced as much opinion change 
as might reasonably be expected in audiences 
of this type. The participants in these stud- 
ies were all of superior educational and eco- 
nomic status, as are most active members of 
community groups formed voluntarily for self- 
education. Consequently, their initial reac- 
tions to the questionnaire items were oriented 
in the direction of those expressed by our 
panel of professionals who provided us with 
anchoring points for the scoring system. That 
the group members moved further in this di- 
rection following treatment is a tribute to the 
effectiveness of the films; to expect even 
greater change as a result of group discus- 
sion is perhaps unreal. 

It should not be concluded from these find- 
ings that group discussion has no salutary ef- 
fects in conjunction with film presentations. 
There was noticeable discontent among some 
members of the film-alone groups, who mildly 
resented being dismissed without an oppor- 
tunity to talk about the film that they had 
just seen. Even the promise of an organized 
discussion at the conclusion of the film se- 
ries did not completely allay these complaints. 
It has been our experience that community 
groups welcome a chance for discussion of 
films of this type; but it also is important to 
note that the instructional value of the film 
does not seem to rest upon a related discus- 
sion. Whether discussions of a different type, 
for example, those held under a directive 
leader, would be more effective in influencing 
attitude change remains a subject for experimental
determination.

A final comment with respect to certain 
methodological problems encountered in re- 
search of this sort may be useful. Sampling 
is necessarily subject to some serious limita- 
tions. It is virtually impossible to assign in- 
dividuals randomly to treatments when deal- 
ing with adults who are under no compulsion 
to appear at scheduled times or to meet in 
inconvenient places. Groups must be located 
and persuaded to participate in the research 
project, and it is frequently difficult to sched- 
ule a series of meetings with the same indi- 
viduals in attendance. While it would have 
been highly desirable to include more groups 
under the several conditions of these experi- 
ments, these practical considerations militated 
against such a procedure. Statistical con- 
trols, therefore, must frequently be exerted 
where experimental controls are lacking. 

Despite these limitations, we feel reason- 
ably confident in concluding: (a) that mo- 
tion picture films shown in a coherent series 
can significantly modify opinions and beliefs, 
and (b) that a series of mental health films
shown with or without audience participation
through organized discussion is effective in
changing opinions and beliefs about mental 
illness. 


Summary 


Two experiments were designed to evaluate 
the hypotheses that one or more sound mo- 
tion picture films would modify the opinions 
and beliefs of audience members, and that 
group discussion of the films would augment 
this effect. Participants in the studies were 
members of adult community groups. Opin- 
ions were measured before and after experi- 
mental treatment by means of a 47-item ques- 
tionnaire containing statements about mental 
illness and scored by the method of sum- 
mated ratings. 

Results of the two investigations indicated 
that a single mental health film did not pro- 
duce significant changes in opinions toward 
mental illness in groups, regardless of whether 
or not the groups engaged in discussion of 
the films. A series of three films, however, 
induced significant shifts of opinion in the directions
intended by the film content. Degree
of opinion change was no greater in
groups which had discussed the films than in 
groups which had not held discussions. 

The findings have been discussed in terms 
of the types of films employed as well as the 
characteristics of typical audiences for which 
these films are intended. 


Received April 8, 1957. 


The Influence of Noxious Environmental Stimuli on Vigilance 1

The problem of vigilance or sustained attention
to randomly occurring, obscure signals
has received much attention in recent
years. Many studies (1, 2, 5, 10) have been 
concerned with the asymptotic decline in per- 
formance with the passage of time, and most 
theoretical formulations (3, 6, 10) have dealt 
with this aspect of the problem. A decline in 
vigilance has also been observed as a func- 
tion of exposure to noxious environmental 
stimuli, especially noise and heat (4, 8, 10, 
11). Decrements in vigilance due to heat 
have not been reported as consistently as 
decrements due to noise. 

A deterioration in performance under the 
influence of noxious environmental stimuli 
might be attributed to a change in the 
physiological state of the individual, to the 
elicitation of responses incompatible with the 
detection of stimuli or the reaction to them, 
or to changes in motivation. In practice it is 
difficult to resolve these alternative explana- 
tions. The present study was designed sim- 
ply to investigate the influence of combined 
noise and vibration, of combined heat, noise, 
and vibration, and of heat alone upon the per- 
formance of a simple monitoring task. 


Procedure 


Task. The monitoring task employed was 
similar to Broadbent’s “Twenty Dials” (4). 
The S was seated on a bench in an Army 
troop carrier (an armored, tracked truck) in 
front of a board bearing 20 numbered dials. 
The dials, approximately two inches in di- 
ameter, were arranged in two equal and 
2 Now at SAGE Operator Research Unit, Operator
Laboratory, Air Force Personnel and Training Re- 
search Center, L. G. Hanscom Field, Bedford, Mas- 
sachusetts. 

3 Now at Ohio State University. 


aligned horizontal rows. A pointer on each 
dial normally pointed to a position within an 
arc, circumscribed by two black marks and 
outlined in blue. 

The S held a rectangular response board on 
his lap. On it were 20 push buttons, num- 
bered and arranged spatially like the dials on 
the display board. It was S’s task, whenever 
a pointer moved outside the circumscribed 
arc, to push the button corresponding to the 
dial. Response times in milliseconds were 
obtained and recorded by E. Pointers moved
at random intervals ranging from one to seven 
minutes. A complete session under experi- 
mental condition consisted of 49 trials and 
lasted for three hours and forty-five minutes. 

Conditions. Every S performed on the task 
four times, under a different environmental 
condition each time. In the stationary night 
(control) condition, the troop carrier was sta- 
tionary, the temperature varied between 65 
and 75 degrees Fahrenheit, the ambient noise 
level varied from 65 to 75 decibels re .0002 
dyne per square centimeter, and there was no 
appreciable vibration. In the moving night 
(noise and vibration) condition, the tempera- 
ture was the same, the noise level was 115 to 
125 decibels, and there was considerable vi- 
bration. (The vehicle noise was random and 
fairly flat up to approximately 5,000 cycles 
per second. No adequate means of measur- 
ing or analysing vibration was available.) In 
the day moving (heat, noise, and vibration) 
condition, the noise and vibration were ap- 
proximately the same as in the night moving 
condition, but the temperature ranged from 
110 to 125 degrees. In the day stationary 
(heat) condition, noise and vibration were 
comparable to that in the night stationary 
condition, but the heat was comparable to 
that in the day moving condition. Humidity 
in all conditions was between 4% and 24%. 

All of the experimental data were collected 
at the Yuma Test Station, Arizona. The 
vehicle moved over a level, pre-established
course. 

There were 24 possible sequences of the 
four environmental conditions. These were 
arranged in random order and Ss were as- 
signed to them in order of their arrival. 
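
The assignment scheme described above can be sketched briefly in Python. This is an illustration only: the condition labels, the function name `assign_sequences`, and the fixed random seed are assumptions introduced here and are not details reported for the study.

```python
import itertools
import random

# Illustrative labels for the four environmental conditions (assumed names).
CONDITIONS = ("night stationary", "night moving", "day moving", "day stationary")


def assign_sequences(n_subjects, seed=0):
    """Shuffle all orderings of the conditions and hand them out by arrival order."""
    sequences = list(itertools.permutations(CONDITIONS))  # 4! = 24 orderings
    rng = random.Random(seed)  # the seed is an assumption, used only for repeatability
    rng.shuffle(sequences)
    # The k-th S to arrive receives the k-th sequence in the shuffled list.
    return {subject: sequences[subject] for subject in range(n_subjects)}


schedule = assign_sequences(12)
```

With 12 Ss and 24 possible sequences, only half of the sequences are used, so each S necessarily receives a different order of conditions.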

Subjects. Twelve Ss were recruited from 
the personnel of the Army Medical Research 
Laboratory. They were screened for good 
auditory acuity and generally good physical 
condition. A prize of twenty dollars was of- 
fered to the S with the best overall perform- 
ance on the task. 


Results 


For purpose of analysis the 49 trials were 
divided consecutively into seven blocks of 
seven trials. Each block represented an in- 
terval of approximately 32 minutes. It was 
decided that the median would best represent 
the average performance for each block. 
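
The blocking step can be illustrated with a short Python sketch. The response times below are invented for illustration, the function name `block_medians` is hypothetical, and base-10 logarithms are an assumption, since the base of the logarithmic transformation applied before the analysis of variance is not stated.

```python
import math
import statistics


def block_medians(response_times_ms, block_size=7):
    """Split consecutive trials into blocks and return the median of each block."""
    if len(response_times_ms) % block_size != 0:
        raise ValueError("number of trials must be a multiple of block_size")
    return [
        statistics.median(response_times_ms[i:i + block_size])
        for i in range(0, len(response_times_ms), block_size)
    ]


# Hypothetical response times (ms) for one 49-trial session.
trials = [900 + 10 * (i % 7) + 5 * (i // 7) for i in range(49)]
medians = block_medians(trials)
# Log transform of the block medians (base 10 is assumed here).
log_medians = [math.log10(m) for m in medians]
```

The median is less sensitive than the mean to an occasional very long response time, which is a plausible reason for preferring it in a task of this kind.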

Figure 1 pictures the results diagrammati- 
cally. It is apparent that in the moving con- 
ditions (which involved noise and vibration), 
the response times were greater than in the 
stationary conditions. 


Table 1 summarizes the analysis of variance.
For this analysis the median values
previously discussed were normalized by a 
logarithmic transformation. The variances be- 
tween blocks of trials, between environmental 
conditions, and between Ss were all much 
greater than would be expected by chance. 
The interaction of Ss with conditions was significant.
The interaction of conditions and
blocks fell short of significance at the 0.05
level.


Fig. 1. Median response times on successive blocks
of trials under different experimental conditions.


Table 1

Analysis of Variance of Log Median Response Times
on Successive Blocks of Trials

Source               df    Mean Square
Blocks of trials      6       0.172
Conditions            3       2.600
Subjects             11       0.551
C × B                18       0.077
C × S                33       0.209
S × B                66       0.052
C × B × S           198       0.048

* P < .05.
** P < .01.

A further breakdown of the analysis re- 
vealed that the significant variance between 
environmental conditions was largely attrib- 
utable to the difference between the moving 
and stationary conditions. The difference be- 
tween the means of the day and night condi- 
tions was not significant. Variance between 
blocks of trials was largely attributable to the 
large increases in median response times on 
the second and third blocks of the day mov- 
ing condition (see Figure 1). These increases 
also explain the nearly significant interaction 
of conditions and blocks of trials. 

Discussion 

The large, very significant increases in re- 
sponse times in the moving conditions pre- 
sumably reflected the influence of exposure to 
noise or vibration or both. The data afford 
no information as to the relative contribution 
of these stimuli to the observed performance 
decrement. These changes, as well as the 
nonsignificant overall temporal changes, cor- 
roborate Broadbent’s findings (4). 

It is especially interesting that when Ss 
were exposed to heat alone, in the day sta- 
tionary condition, no significant decrement re- 
sulted, but that when Ss were exposed to heat, 
noise, and vibration in the day moving con- 
dition, a significant though transitory addi- 
tional decrement was produced. 

No final conclusion as to the nature of the 
effects may be drawn. Conceivably the differential
effects of the noise and vibration
and the heat might reflect the presence of dif- 
ferent physiological or psychological mecha- 
nisms, but this is by no means certain. A 
more crucial exploration of the problem 
should be undertaken. 


Summary 


Twelve Ss in an Army troop carrier were 
asked to detect and respond to obscure, ran- 
domly occurring signals under each of four 
field conditions. In the control condition, 
the noise and heat levels were moderate and 
the vehicle was stationary. During the noise 
and vibration condition, the vehicle was mov- 
ing, noise and vibration were considerable, 
and the temperature was moderate. In the 
heat condition, the heat was rather intense, 
the noise was moderate, and the vehicle was 
stationary. During the heat, noise, and vi- 
bration condition, the vehicle was moving, 
and the noise, vibration, and heat levels were 
rather high. 

Noise and vibration produced by the mov- 
ing vehicle appreciably increased the median 
response times of the Ss. Further decrement 
occurred when heat was combined with noise
and vibration, but the effect was relatively
transitory. Heat alone had no apparent ef- 
fect. Changes occurring as a function of 
elapsed time were not apparent. 

The problem with which the present study
is concerned is that of selecting and placing 
individuals in supervisory positions in the 
technical operations or departments of a com- 
pany. Such operations would include the de- 
sign and development of mechanical and proc- 
ess equipment, production supervision, and 
the study and improvement of production 
methods. An analysis of the duties and re- 
sponsibilities of men working in these areas 
suggests that some quality which might be 
termed mechanical aptitude should be a fac- 
tor in successful performance on the job. 
Specifically, the object of the present investi- 
gation was to determine whether or not the 
qualities measured by a widely used (1) test 
of mechanical aptitude, Mechanical Compre- 
hension Test—Form CC (10), are related to 
successful performance in these activities. 

Some degree of success has been attained in 
finding relationships between mechanical apti- 
tude test scores and job performance (3, 6, 
11) where the performance is largely motor. 
However, the existence of a special aptitude, 
as distinct from general aptitude or intelli- 
gence, for performing in more complex types 
of situations such as those faced by the engi- 
neer has not been satisfactorily demonstrated. 
In 1942 Bennett and Cruikshank (2), after a 
review of the findings of research workers 
concerning the validity of mechanical apti- 
tude tests in predicting engineering school 
success, concluded that “Some of them ap- 
proach some degree of usefulness, but in gen- 
eral the tests used are not thoroughly depend- 
able touchstones for predicting engineering 


1 The author wishes to express his appreciation to 
Richard S. Uhrbrock for his kind suggestions and 
help in the formulation of the problem and treat- 
ment of the data. 





school success.” A review of the studies since
then suggests that this evaluation may still 
apply today. 

Owens reports that Form CC of Mechani- 
cal Comprehension Test was developed with 
the intent that it would be a test of me- 
chanical comprehension with items of suffi- 
cient difficulty to measure higher levels of 
mechanical aptitude, i.e., the degree of apti- 
tude needed for successful performance in en- 
gineering courses or in engineering positions 
after graduation (9). On the basis of the re- 
sults obtained when the test was adminis- 
tered to 725 incoming freshmen, Owens con- 
cludes that the Form CC scores were making 
a significant independent contribution in the 
prediction of engineering school grades. Al- 
though this conclusion may be supported by 
the obtained data, it seems that the amount 
of this contribution was so small that con- 
sideration of scores on the test in addition to 
general aptitude test scores in selecting stu- 
dents or employees might not be justifiable 
considering the added time and expense in- 
volved. 

In another study designed to determine the 
relationship between scores on Mechanical 
Comprehension Test (Form CC) and grades 
in engineering school (7), using 130 freshmen 
students, a product-moment correlation of 
about + .40 was obtained with first year 
grades. No attempt was made to determine 
the degree to which these scores were meas- 
ures of general aptitude, i.e., intelligence. 

Assuming that, as Owens (9) suggests, 
Form CC does measure a higher level of me- 
chanical comprehension or aptitude, it seems 
logical to hypothesize that scores on this test 
should be related to performance in a tech- 
nical or engineering position in industry. 
The purpose of the present investigation is to
determine whether or not such a relationship
exists. The procedure will involve an item
analysis of the test in question using both internal
and external criteria.


Procedure 
The Test 


The test used was the Mechanical Comprehension 
Test—Form CC (10). This form contains 60 multi- 
ple-choice items designed to sample the examinee’s 
ability to understand mechanical relationships and 
the results of forces acting upon objects. For ex- 
ample, in some of the items the examinee is pre- 
sented with drawings showing different possible de- 
signs of a mechanical device and asked to identify 
the one which would give the greatest mechanical 
advantage. The test was administered to the Ss of 
the present study in groups of 25 or less and ac- 
cording to the directions of the authors of the test 
(10). The Ss indicated their answers on IBM an- 
swer sheets designed specifically for this test. The 
score used was the total number right. 


The Subjects 


The Ss of the investigation were 208 members of 
supervision in a large manufacturing organization.
All were working in the departments of the com- 
pany concerned with manufacturing or applied re- 
search. The divisions of the company represented 
by the Ss were production supervision, industrial en- 
gineering, and equipment design. All Ss were col- 
lege graduates employed by the company during the 
ten-year period prior to the study. The mean length 
of service at the time of the study was approxi- 
mately four and one-half years. Eighty per cent of 
the Ss held degrees in engineering or the physical sci- 
ences. Twenty per cent of the men who were en- 
gaged in the industrial engineering and production 
activities had liberal arts degrees. All the Ss had 
been hired according to similar standards in a highly 
developed selection program. The selection pro- 
cedure included a test of general aptitude developed 
by the company. Most of the men had been hired 
immediately upon graduation from college. All had 
participated in well planned, on-the-job training 
programs to prepare them for their particular duties. 


The Criterion 


The measure of performance used in the present 
study was a rating of each S by his immediate su- 
perior. The rating scale used was designed and pub- 
lished by a consulting organization for the purpose 
of evaluating the performance of supervisory per- 
sonnel (8). It consists of a series of 60 scaled state- 
ments describing a supervisor's performance. In re- 
sponse to each statement the rater indicates whether 
or not it applies to the ratee by checking either “Yes 
or True” or “Not True at Present.” The statements 
are so worded that in one-half the items “Yes or
True” is the favorable response while in the other
half “Not True at Present” is favorable. The state- 
ments deal with such areas of performance as pro- 
ductivity, dependability, accuracy of work super- 
vised, relationships with associates, etc. In the scor- 
ing of the scale the statements are weighted with 
values ranging from 1 to 3. These weights were 
assigned to the various statements by the authors of 
the scale according to the statements’ power to dis- 
criminate between good and poor supervisors in the 
original standardization group. The criterion meas- 
ure for each S in the present study was the total 
raw score, i.e., the total of the weights of the items
checked favorably, on the rating scale. Since this 
measure includes evaluations of performance in so 
many areas of the position it is undoubtedly fac- 
torially complex. This would tend to reduce its 
correlation with any other measure. However, in 
the opinion of the present author the technical or 
engineering competence of the individual should be 
a major factor in his overall success on the types of 
work being performed by the present Ss. 

As a further check on the acceptability of this 
scale the present author computed a split-half re- 
liability coefficient based upon the 208 rating scales 
used in this investigation. Scores on the first 30 
statements in the rating scale were correlated with 
those for the second half of the rating scale. Ap- 
plication of the Spearman-Brown formula resulted 
in a corrected reliability coefficient of + .8986 (5). 
This was considered acceptable. 
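
The Spearman-Brown correction can be illustrated as follows. The sketch is hedged: the uncorrected half-test correlation is not reported in the text, so the value computed below (about .82) is only what the reported corrected coefficient of .8986 implies, and the function names are hypothetical.

```python
def spearman_brown(r_half, length_factor=2.0):
    """Predicted reliability of a scale lengthened by `length_factor`."""
    return length_factor * r_half / (1.0 + (length_factor - 1.0) * r_half)


def half_correlation_from_corrected(r_full):
    """Invert the two-half case: the half-scale r that yields r_full."""
    return r_full / (2.0 - r_full)


# Half-scale correlation implied by the reported corrected coefficient.
r_half = half_correlation_from_corrected(0.8986)
r_corrected = spearman_brown(r_half)
```

Because the halves here are the first 30 and the last 30 statements rather than odd and even items, the estimate depends on the two halves being comparable in content.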


Results 


The criterion measures (weighted scores on 
the rating scales) ranged from 16 to 107 with 
a mean score of 71.13 and an SD of 21.25. 
The raw scores on Mechanical Comprehension
Test ranged from 19 to 60 with the mean at
48.81 and an SD of 6.81. The computation
of a product-moment correlation between the 
criterion measures and test scores yielded an 
r of .074. This coefficient is not significant 
at the 1% level of confidence (5). 
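
One common way to check such a coefficient is the t test for a product-moment r, sketched below in Python. This is an illustration under stated assumptions: the text cites reference (5) for the test actually used, the critical value of 2.60 is an approximate two-tailed 1% table value for about 200 degrees of freedom, and the function names are hypothetical.

```python
import math


def t_from_r(r, n):
    """t statistic for testing a product-moment r against zero (df = n - 2)."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


def critical_r(t_crit, n):
    """Smallest absolute r that reaches a given critical t with n - 2 df."""
    df = n - 2
    return t_crit / math.sqrt(t_crit * t_crit + df)


N = 208
T_CRIT_01 = 2.60  # assumed approximate two-tailed 1% critical t for about 200 df

t_total = t_from_r(0.074, N)           # well below the critical value
r_needed = critical_r(T_CRIT_01, N)    # roughly .18 for a sample of this size
```

Under these assumptions a coefficient must reach roughly .18 in absolute value to be significant at the 1% level with 208 Ss, so an r of .074 falls far short.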

Item validity was measured by the biserial 
r between the response to each item and the 
criterion measures. The numerical values of 
the item validity coefficients ranged from .00 
to .37 with the median coefficient (without
regard to sign) being .08. Coefficients of cor- 
relation significant at the 1% level of confi- 
dence were obtained for fourteen of the 60 
items on the test. These items and their va- 
lidity coefficients are given in Table 1. A 
significant negative correlation was obtained 
for three of the items. 
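
A biserial r of the kind used for item validity can be sketched as follows. The exact computing formula used in the study is not given, so this follows the textbook form, in which the difference between the criterion means of those passing and failing the item is divided by the criterion SD and multiplied by pq over the normal ordinate at the point of division. The data and the function name are hypothetical.

```python
from statistics import NormalDist, mean, pstdev


def biserial_r(item_correct, criterion):
    """Biserial r between a pass/fail item and a continuous criterion."""
    passed = [c for ok, c in zip(item_correct, criterion) if ok]
    failed = [c for ok, c in zip(item_correct, criterion) if not ok]
    if not passed or not failed:
        raise ValueError("item must have both passes and failures")
    p = len(passed) / len(criterion)
    q = 1.0 - p
    std_normal = NormalDist()
    # Height of the unit normal curve at the point dividing it into p and q.
    ordinate = std_normal.pdf(std_normal.inv_cdf(p))
    return (mean(passed) - mean(failed)) / pstdev(criterion) * (p * q / ordinate)


# Hypothetical item responses (1 = correct) and criterion ratings for eight Ss.
item = [1, 1, 1, 0, 0, 1, 0, 1]
ratings = [80, 75, 90, 40, 55, 70, 60, 85]
validity = biserial_r(item, ratings)
```

A negative coefficient, such as those obtained for three of the items, means that Ss who missed the item tended to receive higher ratings than those who passed it.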

Table 1

Item Validity Coefficients Significant at 1% Level

Item No.    Item Validity
                -.18
                 .37

The obtained biserial r’s between individual
items and total scores on the test were all
positive and ranged from .85 to .01 with the
median coefficient being + .50. Only two of
the items, numbers 7 and 38, failed to show a
significant positive correlation with the total
score.

The item difficulty measures ranged from 
1% to 84%, i.e., the easiest item was missed 
by 1% of the group while the most difficult 
was answered incorrectly by 84% of the Ss. 
The median item difficulty was 13%.
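
The difficulty index defined above, the percentage of Ss answering an item incorrectly, can be computed as in the following sketch. The answer matrix is invented for illustration and the function name is hypothetical.

```python
from statistics import median


def item_difficulties(responses):
    """Percentage of examinees missing each item; rows are examinees, 1 = correct."""
    n = len(responses)
    n_items = len(responses[0])
    return [
        100.0 * sum(1 for row in responses if not row[j]) / n
        for j in range(n_items)
    ]


# Hypothetical answer matrix: four examinees by three items.
answers = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 0],
    [1, 1, 1],
]
difficulties = item_difficulties(answers)
median_difficulty = median(difficulties)
```

On this definition a low value indicates an easy item, so a median near 13% means that the typical item was answered correctly by most of the group.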

The answer sheets were rescored on the ba- 
sis of the eleven items for which significant 
positive correlations with the criterion were 
obtained. A product-moment coefficient of 
correlation computed between total number 
right of the eleven items and the criterion 
measures yielded a coefficient of .31. This is 
significant at the 1% level of confidence. 

Discussion 

The obtained r of .074 between perform- 
ance measures and scores on Mechanical Com- 
prehension Test suggests no basis for assum- 
ing that relationship exists between the quali- 
ties measured by the total score on this test 
and success as a supervisor in a technical in- 
dustry. The fact that the validity coefficients 
obtained for the majority of the individual 
items are not significant and that the 14 coefficients
which are significant are low further
supports the conclusion that the measures
are unrelated. When the answer sheets
of the Ss were rescored for total number right 
for the eleven items with positive validity co- 
efficients significant at the 1% level and these 
scores correlated with criterion measures, the 
obtained r was + .31. This is significant well 
beyond the 1% level. Since there are only 11 
valid items it does not seem justifiable to sug- 
gest that the test be used in its present form 
and scored on the basis of this small number 
of items alone, especially without cross vali- 
dation of these items with another group of 
Ss. The results obtained in the present study 
suggest that before the test could be recom- 
mended for use in industrial situations simi- 
lar to those of the present study the authors 
should conduct further research to determine 
what the characteristics of the valid items 
are which cause them to select satisfactorily. 
In particular, items 3, 7, and 49, which show 
significant negative correlations, should re- 
ceive attention. 

The measures of item difficulty suggest that 
the test is too easy for a group of Ss having 
the training and background of those used in 
the present study. It is true that the Ss of 
this study were college graduates who had 
been carefully selected before employment on 
the basis of general ability and personal char- 
acteristics. However, if mechanical aptitude 
exists independent of these factors the test 
should have shown some relationship to per- 
formance on the job. 

The fact that moderate to high correlations 
with total score were found for most of the 
items in Mechanical Comprehension Test sug- 
gests that the test is consistently measuring 
some quality. An inspection of the items sug- 
gests the hypothesis that the quality may be 
a type of nonverbal intelligence or judgment. 
Studies such as those of Bruce (4) and Sar- 
tain (12) would seem to support this. In 
both of these cases significant positive cor- 
relations were obtained with Form AA of 
Mechanical Comprehension Test and tests 
of mental ability. A reasonable further hy- 
pothesis would be that, based on the fact the 
items were too easy for the present Ss, the 
test would act as a measure of general intelli- 
gence at lower levels. This possibility might 
account for the correlations which have been
obtained between scores on the test and prog- 
ress in training programs and on certain types 
of jobs (3, 11). 

A percentile table constructed for the Ss 
used in the present study did not appear to 
differ markedly from the one given by Owens 
and Bennett for college seniors in engineering 
courses (10). The mean raw score for the 
engineering school seniors is 47.00 while the 
mean for the present group of Ss is 48.81. 
The 75th percentile for the college seniors is 
at a raw score of 51 while that for the Ss 
would be approximately 54. The 25th per- 
centile for seniors is at raw score 43 while 
that for the Ss would be about 46. There is 
no indication that there is a substantial dif- 
ference between the college seniors who were 
members of the standardization group and 
the present group of Ss in terms of the qual- 
ity being measured by Mechanical Compre- 
hension Test. 


Summary and Conclusions 


Two hundred and eight members of su- 
pervision employed in the technical operations 
of a large manufacturing organization took 
Mechanical Comprehension Test (Form CC) 
and were rated on performance as a super- 
visor by their superiors. The statistical 
analyses of the data included measures of 
overall test validity and an item analysis with 
measures of item difficulty, item validity, and 
internal consistency. Test records were re- 
scored on the basis of the 11 items found to 
be significantly related to the criterion. These 
scores were then correlated with the criterion 
measures. A percentile table based upon the 
population of which the present Ss are a sample
was constructed.

Based upon the data of the present investi- 
gation the following conclusions seem war- 
ranted. 

1. The qualities measured by Mechanical 
Comprehension Test (Form CC) are not related
to successful performance as a supervisor
under the conditions of the present
study.

2. Form CC of Mechanical Comprehension 
Test is not sufficiently difficult for use with 
college graduates having the training and 
backgrounds of the Ss of the present study. 

3. The items of this test are consistently 
measuring some quality. It is hypothesized 
that this is an aspect of general intelligence 
or judgment rather than anything “mechani- 
cal.” 

As reported by Forlano (3) and by Teichner
(7), studies of the effects of cold environments
on the simple reaction time (RT)
suggest that RT is not affected by low ambi- 
ent temperatures down to — 50° F. How- 
ever, temperature is only one of the factors 
which make up cold environments. The cool- 
ing power of air actually depends on both its 
temperature and speed of movement (wind- 
speed). The effect of each of these, singly 
and in combination, must be studied before 
safe generalizations can be made about the 
effect of the cold on RT. Further, the combined
action of temperature and wind (windchill;
see footnote 2) in determining the cooling rate of
exposed bodies has been formulated quantita- 
tively; thus, there is a basis for a rational ap- 
proach to the combined effects problem. It 
was the concern of the present investigation, 
therefore, to study the effects of the cold on 
RT through variation of all three physical 
factors. 

As long as Ss wear protective clothing, as 
they have in previous studies, S-R relation- 
ships may be misleading. That is, with no 
information beyond the stimulus conditions 
and S’s response, it is not possible to deter- 
mine whether the environment was actually 
effective in cooling the body. Failure to find 
a temperature effect in previous studies may 
have been the result of lack of actual body 
cooling. Thus, studies which fail to measure 


1 Now at the University of Massachusetts. 

2 Windchill is a measure of that part of the total 
cooling of a body due to the action of wind. The term 
is not usually applied to temperatures above freezing. 
Values of windchill used in this study were obtained 
from reference (6) based on Siple and Passel’s (5) 
formula: 


$K_0 = (\sqrt{100\,wv} + 10.45 - wv)(33 - T_a)$

where:

$K_0$ = Total cooling in kilogram calories per square
meter per hour,

$wv$ = Wind velocity in meters per second,

$T_a$ = Air temperature in degrees centigrade.





body cooling cannot yield information of gen- 
eral value nor are the results amenable to 
theoretical considerations, either physiologi- 
cal or psychological. The present study was 
designed, therefore, to obtain body surface 
temperatures for relationship to the effects of 
cold environments. 
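
The windchill formula given in footnote 2 can be evaluated directly, as in the sketch below. The function names and the unit-conversion helpers are introduced here for illustration. The values used in the study were taken from reference (6), so figures computed this way may differ slightly from the tabled ones.

```python
import math


def windchill_kcal(wind_mps, air_temp_c):
    """Total cooling in kg. cal. per square meter per hour (Siple and Passel form)."""
    return (math.sqrt(100.0 * wind_mps) + 10.45 - wind_mps) * (33.0 - air_temp_c)


def mph_to_mps(mph):
    """Convert miles per hour to meters per second."""
    return mph * 0.44704


def fahrenheit_to_celsius(deg_f):
    """Convert degrees Fahrenheit to degrees centigrade."""
    return (deg_f - 32.0) * 5.0 / 9.0


# Example: a 5 mph wind at -15 degrees F.
k = windchill_kcal(mph_to_mps(5), fahrenheit_to_celsius(-15))
```

The second factor shows why the index falls to zero at an air temperature of 33 degrees centigrade, which is taken as the skin temperature in this formulation.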


Method 


Six hundred and forty infantrymen from Fort 
Devens were used as Ss. Twenty-man groups were used,
one per day until the total number was exhausted. 
On arrival at the laboratory the 20 Ss were ran- 
domly sorted into five-man subgroups and each op- 
eration of the investigation was phased to handle 
the sequential appearance of the four subgroups. 
Two subgroups were studied before the noon meal 
and two after it. Twenty Ss were eliminated for 
medical reasons prior to starting. 

Ss were taken to a dressing room (55-60° F.) 
which interconnected with the climatic chamber, 
where they undressed, a multi-point thermocouple 
harness was put on them and they were dressed in 
appropriate clothing. These procedures were per- 
formed “by the numbers” so that all five Ss were 
dressed at the same time, thus avoiding individual 
overheating. While Ss were in the dressing room 
standard instructions were read to them which ex- 
plained the details of the procedure to follow. When 
dressed, they were taken into the climatic chamber 
which was pre-set for the appropriate environmental 
conditions. 

In the chamber, the five Ss sat side by side about 
three ft. apart, before a long table in front of a 
large observation window. They faced sideways to 
the direction of air movement and were in front 
view of technicians operating the equipment out- 
side of the chamber. From their positions, however, 
Ss were unable to observe the operation of the 
equipment. 

Ss sat quietly and cooled for the first 25 min. 
During this time the instructions were read again 
and procedures demonstrated. After this they per- 
formed on a manual dexterity task for about 20 min. 
On completion of this task (45 min. of exposure), 
Ss were seated and 25 successive RTs were obtained. 
This procedure was completed in 7-10 min. Follow- 
ing this, Ss ran in place slowly for three min. (mild 
exercise), performed five min. more on the psycho- 
motor task and then 10 more successive RTs were 
obtained. 

Each S was provided with a Morse key fastened 
to the table. At a verbal ready signal, Ss closed the 
keys with their preferred hands. At the reaction
signal which was provided by 100 w. lights mounted 
opposite them, they removed their hands from the 
keys as quickly as possible and rested them on the 
table. Standard Electric Timers provided a .01 sec. 
recording of the times between the simultaneous clos- 
ing of each of the five simple circuits (onset of 
lights) and the individual reopening of each circuit 
as the Morse keys were released. 

Ten thirty-gauge copper-constantan thermocouples
were taped to different parts of the body of each S. 
The output of each thermocouple was recorded by a 
Leeds and Northrup recording potentiometer; they 
were also automatically weighted according to the 
percentage of total body surface area each repre- 
sented, integrated electronically and recorded as a 
measure of mean weighted skin temperature. The 
ten thermocouple placements and their respective 
percentage weights are shown in Table 1. In view 
of the lack of familiarity of the Ss with the situa- 
tion, it was not deemed advisable to obtain rectal 
temperatures although these would have been highly 
desirable. 

Table 1

Placement of Thermocouples and Associated
Percentage Weights *

Position        Weight
Instep           .050
Calf             .150
Lat. thigh
Med. thigh
Back
Chest
Upper arm
Lower arm
Hand
Cheek

* Mean Weighted Skin Temperature = $\sum_{i=1}^{10} (\text{Position}_i \times \text{Weight}_i)$.


The output of the thermocouples was recorded in 
sequence at a rate which provided a complete de- 
scription of each S’s skin temperature once every 
four min. In addition, the output of an electronic 
analog-to-digital computer (footnote 3) working off the armature
of the potentiometer was fed to a No. 523 IBM
Summary Punch. Thus, the skin temperatures were
immediately available for IBM processing. The potentiometer
recordings were used as a means of observing
the skin temperatures of the Ss. The Ss
were removed from the experiment as soon as possible
after an extremity dropped to 38° F.

The experimental plan called for a 2 × 5 factorial
of temperature and windspeed, a number of temperatures
at constant windspeed, and two groups of
Ss at 60° F., one lightly clothed (fatigues) and one
group nude (shorts and socks). Other than these
two groups, all other Ss wore a complete standard
Arctic uniform. Difficulties in keeping Ss safely
above frostbite level at the higher windchills required
some modifications in experimental plans.
The actual conditions used and the numbers of Ss
who started and finished are shown in Table 2.

3 G. M. Giannini, Datex Digital Encoding System.


Results 


Data were used only from Ss who com- 
pleted the experiment. Each RT was trans- 
formed to its reciprocal and all results were 
treated in terms of the transformed measure. 
Since this was a reciprocal of time, it may be 
thought of as an index of speed of reaction 
and will be called reaction speed (RS). In- 
spection of successive reactions in each group 
did not suggest any trend within the reaction 
series, either one suggesting performance in- 
crements or decrements. For this reason the 
mean RS was obtained for each S for the 25 
responses in the first series. These values 
were used as the basis for determining all 
effects. 
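
The transformation to reaction speed can be sketched as follows. The reaction times shown are hypothetical and the function names are introduced only for illustration.

```python
from statistics import mean


def reaction_speeds(reaction_times_sec):
    """Transform each reaction time to its reciprocal (reaction speed, RS)."""
    return [1.0 / rt for rt in reaction_times_sec]


def mean_reaction_speed(reaction_times_sec, n_first=25):
    """Mean RS over the first series of reactions for one S."""
    return mean(reaction_speeds(reaction_times_sec[:n_first]))


# Hypothetical RTs in seconds, recorded to the nearest .01 sec.
rts = [0.20, 0.25, 0.20, 0.25]
rs = mean_reaction_speed(rts)
```

Note that the mean of the reciprocals is not the reciprocal of the mean RT. Here the mean RS is 4.5 per second, whereas the reciprocal of the mean RT of .225 sec. is about 4.44.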

A plot of RS vs. ambient temperature at 
constant windspeed of five mph showed very 
little variation among the mean values. An 
analysis of variance of the temperature effect 
based on these data yielded an F ratio of less 
than 1.00 which confirms the conclusion that 
these temperatures had no significant effect. 

An analysis of variance of the 2 x 4 fac- 
torial represented by temperatures of — 15° 
F. and — 35° F. at windspeeds of 5, 10, 15, 
and 20 mph is presented in Table 3. The 
unequal frequencies of this factorial were 
treated as described by Rao (4); the sum- 
mary table also follows Rao. Evaluation of 
the wind-temperature interaction mean square 
provided by Table 3 indicates that these two 
factors did not interact significantly in their 
effects. An F of 12.21 was obtained for the 
temperature effect and of 4.88 for the wind 
effect. Both of these are significant at less 
than the .01 level of risk. Thus, it may be 
concluded that both the temperatures and 
the winds involved in this analysis had sig- 
nificant, independent effects on RS. 

Table 2

Experimental Conditions

Ambient                    Windchill             No. Subjects
Temperature   Windspeed    (Kg. Cal./m.²/hr.)    Start   Finish   Clothing
(° F.)        (mph)

   60                                              59      59     Fatigues
   60                                             118     118     Nude
   30            5               780              100     100     Arctic
    0            5             1,166               40      40     Arctic
  -15            5             1,359               39      39     Arctic
  -15           10             1,609               40      40     Arctic
  -15           15             1,765               38      37     Arctic
  -15           20             1,873               40      27     Arctic
  -15           30             2,018               18       3     Arctic
  -25            5             1,488               17      17     Arctic
  -35            5             1,617               20      19     Arctic
  -35           10             1,914               19      19     Arctic
  -35           15             2,100               37      25     Arctic
  -35           20             2,288               35      12     Arctic


When the results summarized in Table 3
are considered along with the finding of no
temperature effect at the 5 mph windspeed,
an interaction of wind and temperature is
suggested, but it is one which is not statistically
testable in the present experiment. To
examine this possibility further, Fig. 1 was
prepared. This figure shows the effects of
windspeed at -15° F. and -35° F.; it presents
the mean values for the present data
upon which the analysis of Table 3 was based,
and, in addition, it presents the result obtained
with the 30 mph wind at -15° F. Inspection
of this figure reveals that the differences
between the effects of the two temperatures
were large except at 5 mph where it is known
that the indicated difference is not reliable.
Both trends show decreasing RS with increasing
windspeed, the curve for -35° F. dropping
more rapidly. However, this trend appears
to flatten off or actually rise a little
after the initial large drop. This, as well as 
the fact that the other trend appears some- 
what positively accelerated, does suggest an 
interaction of temperature and wind. How- 
ever, as noted, no interaction is demonstrable. 

Figure 2 presents pre- and postexercise 
mean RS as a function of windchill and also 
presents the mean values for the two 60° F. 
groups. It can be seen that the RS was 
slightly, but consistently greater following 
exercise than before it. Both trends shown 
in this figure exhibit a decrease in RS with 
increasing windchill. The lowest windchill 
result shown, 780 Kg.Cal./m.²/hr., may be


Table 3 
Analysis of Variance of Ambient Temperature and Windspeed Effects 








Sum of 


Source Squares 


Mean 


Square 


Sum of 
Squares 


Mean 
Square Source 


df 





8.22 


1.51 
6.72 
15.11 
116.08 


131.19 


Wind ignoring temperature 


Interaction 
Temperature 
Between cells 
Within cells 


Total 


2.74 5.58 1 


50 
6.72" 


5.58 Temperature 


ignoring wind 
2.68" Wind 
Fig. 1. Effects of windspeed on reaction speed.


suspected of not representing an effect clearly 
relatable to windchill. This group wore the 
arctic clothing in a relatively mild condition 
and there is some possibility, therefore, that 
the result obtained was due at least partly to 
a heat stress rather than anything that might 
be called cold. Support for this possibility 
may be found in Table 4 which presents the 
mean skin temperatures during the reaction 
series and which shows that this group was 
the warmest of all groups. For this reason, 
it does not seem safe to include the result ob- 
tained with this group in the general wind- 
chill trend. Figure 2 also shows that there 
was no essential difference in RS between the 
nude and clothed groups at 60° F. before 
exercise and only a very small difference be- 
tween them after exercise. 

An analysis of variance of the windchill
effect shown in Fig. 2, omitting the lowest
windchill group, was carried out on the
pre-exercise data. This analysis provided an F
of 5.96 which with 10/364 df is significant at
less than the .01 level. It may be inferred,
therefore, that the decrease in RS with
increasing windchill suggested by Fig. 2
represents a nonrandom effect.


Fig. 2. Effects of windchill on reaction speed.


Table 4

Mean Weighted Skin Temperature Per Group During
Pre-Exercise Series and Correlation with RS

Windchill                      Mean Temperature
(Kg.Cal./m.²/hr.)      N*      (°F)                  r

  780                  56      90.85
1,166                  29      87.92
1,359                          86.62
1,488                  14      85.78
1,609                          85.50
1,617                  12      84.98
1,765                  31      85.74               .306
1,873                  18      84.37
1,914                          83.55               .118
2,100                  20      82.76               .305
2,228                  10      81.22               .494
60° F. Clothed         35      86.73               .218
60° F. Nude            66      80.24               .332**

* Number of subjects on whom complete skin temperature
data were recorded during reaction series.
** p < .01.


Further inspection of the pre-exercise re- 
sults in Fig. 2, omitting the lowest windchill 
value, suggests that the relationship between
RS and windchill may be closely approximated
by a linear function. The least squares
fit of such a function is given by Equation 1:


RS = 5.59 - .000813W    (1)

where:

RS = reciprocal RT in sec.
W = windchill in Kg.Cal./m.²/hr.


The standard error of fit of Equation 1 is 
.26. The constant, 5.59, which limits the in- 
tercept is equivalent to an RT of .18 sec. 
which is in good accord with the magnitude 
of visual RT to be expected under ideal con- 
ditions. Thus, the equation, though approxi- 
mate, appears to have a reliability and va- 
lidity of value for practical approximation 
purposes. 
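
As a hedged illustration of how Equation 1 might be applied, the sketch below evaluates the fitted line and converts the predicted RS back to an RT in seconds. The function names are hypothetical, and the line was fitted only over the windchill values studied (omitting the lowest), so values far outside that range are extrapolations.

```python
def predicted_rs(windchill):
    """Equation 1: RS = 5.59 - .000813 W, with W in Kg.Cal./m.^2/hr."""
    return 5.59 - 0.000813 * windchill


def predicted_rt(windchill):
    """Predicted RT in sec., the reciprocal of the predicted RS."""
    return 1.0 / predicted_rs(windchill)


# Windchill values taken from the conditions of the experiment.
for w in (1166, 1617, 2100):
    print(w, round(predicted_rs(w), 2), round(predicted_rt(w), 3))
```

At W = 0 the function returns the constant 5.59, whose reciprocal is the RT of about .18 sec. mentioned in the text.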

Skin temperatures were available for all Ss 
but due to the scanning procedure described, 
mean values were available for only 359 Ss
during the first reaction series and a very 
small number of Ss during the second series. 
The mean weighted skin temperature of each 
group and the number of Ss on which the 
mean was based are shown in Table 4. It 
may be seen that the range of the group 
means was relatively small, and that skin tem- 
perature decreased, in general, with increased 
windchill. It can also be seen that the nude 
men as a group had the lowest skin tempera- 
tures of all. 

In order to study the possible relationship 
of RS to skin temperature, Pearson correla- 
tions were computed for each of the condi- 
tions of Table 4. The results are also shown 
in Table 4. All of the coefficients obtained 
were low and only one was significant in a 
probability sense. Assuming that all 13 co- 
efficients are estimates of a zero correlation 
we may ask of the probability of obtaining 
one significantly different from zero at the 
.01 level. This probability is .12 which is too 
high for rejection of the hypothesis. Thus, 
the correlation among the nude men cannot 
be accepted with confidence. In addition to
the correlations shown in Table 4, a correlation
was computed based on all 359 Ss. A
coefficient of .18 was obtained which is
significant at the .01 level. However, the
significance of this correlation is presumably
related to the significance of the correlation
obtained with the nude men and, therefore,
it cannot be accepted with any confidence.
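
The probability of .12 cited above can be checked with a short calculation. The sketch assumes that the 13 coefficients are independent estimates of a zero correlation, each tested at the .01 level; the function name is hypothetical.

```python
def prob_at_least_one(n_tests, alpha):
    """Chance that at least one of n independent tests is significant
    at level alpha when every null hypothesis is true."""
    return 1.0 - (1.0 - alpha) ** n_tests


print(round(prob_at_least_one(13, 0.01), 2))
```

Under these assumptions the result rounds to .12, which agrees with the value given in the text.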


Discussion 


The results are biased in the sense that an 
increasing percentage of individuals suscep- 
tible to frostbite were removed from the ex- 
periment as the conditions became more 
severe. Nevertheless, a clear and systematic 
impairment of performance was demonstrated, 
an impairment that could not have been less 
and would probably have been greater had 
these Ss not been removed. A further
qualification must be made, that all conclusions
apply to “unacclimatized” men, within about
75 min. of exposure, not suffering physiological
distress. With these qualifications the
results indicate that RT is not affected by low
ambient temperature, at least down to -35° F.,
providing the windspeed does not exceed
about five mph. On the basis of previous
conclusions (3, 7), the lower limiting
temperature at low windspeed may be inferred
to be less than this, at least -50° F. On
the other hand, for windspeeds of 10 mph and
greater, RS decreases with decreased
temperature at least from -15° F. and below. It
was also shown that windspeed has a marked
effect on RS at least at temperatures of -15° F.
and below. Finally, it was demonstrated
that RS decreases systematically with
increases in windchill.

Equation 1 provides a first working for- 
mula for application to the design of equip- 
ment and clothing and to the use of men for 
cold-weather conditions. Although it is lim- 
ited to unacclimatized, selected men, and un- 
doubtedly subject to variation with changes 
in clothing, shelter, and the physiological con- 
ditions of individuals, the results suggest that 
the RS function is not importantly based 
upon physiological changes of the individual 
with cold exposure. At least, the lack of cor- 
relation of skin temperatures and, by infer- 
ence, rectal temperatures (1), with observed 
RS differences suggests that the function ob- 
tained was due to other than body heat 
losses. 

One plausible explanation of the results 
may be called the distraction hypothesis. 
This hypothesis assumes that other aspects 
of the environment (wind-produced noise, 
discomfort, and the perceived threat of cold 
exposure) provide competing stimuli which 
interfere with the response elicited by the re- 
action signal and thus produce increased la- 
tencies. The presence of such competing 
stimuli should be most critical during the 
foreperiod of reaction, and, therefore, relat- 
able in a measurable way to the presence 
of nonoptimum preparatory muscular phe- 
nomena (2, 8). 

A distraction hypothesis has interesting im- 
plications. The elicitation strength of dis- 
tracting environmental stimuli should depend 
on their intensity, frequency and duration of 
previous occurrences, conditions of reinforce- 
ment during these occurrences and the anx- 
iety level of the individual; in short, on con- 
ditions of learning. This hypothesis also sug- 
gests that so-called acclimatized individuals,
short of marked physiological changes, may 
be individuals who are habituated in a psy- 
chological sense rather than acclimated in a 
physiological sense. Thus, it may be possible 
to speak not only of a physiological cold tol- 
erance, a term which refers to the resistance 
of the individual to the cooling power of the 
environment, but also to a psychological cold 
tolerance and mean by this resistance of the 
individual to the distracting power of the en- 
vironment. The former presumably depends 
upon physiological (circulatory, thalamic, etc.)
and morphological (body fat, surface area 
and configuration) conditions and character- 
istics of the individual. The latter presum- 
ably depends upon the state of habituation 
of the individual and his anxiety level. 


Summary 


Visual RT’s were elicited from 620 soldiers 
sorted into 14 different groups representing a 
variety of ambient temperatures, windspeeds 
and windchills. Included were two groups at 
60° F., five mph, one of which was nude and 
the other lightly clothed. RT was measured 
after 45 min. of exposure and again following 
a short, mild exercise, after 65 min. of ex- 
posure. In addition, mean area-weighted skin 
temperatures were obtained. The following 
conclusions drawn from the results apply to 
the effects of the cold on “non-acclimatized” 
and/or “non-habituated” men, not in physio- 
logical distress: 

1. At low windspeed, at least up to five
mph, low ambient temperature has no effect
on RS, at least down to -35° F. and prob-
ably down to -50° F.


2. At windspeeds of 10 mph and greater, 
low ambient temperature produces a signifi- 
cant decrease in RS. 

3. Windspeed produces a significant de- 
crease in RS. 

4. Mild exercise produces a small recovery 
in RS. 

5. If men of low “physiological cold toler- 
ance” are removed from the more severe en- 
vironmental conditions and if Ss wear protec- 
tive clothing, RS is essentially a linear de- 
creasing function of windchill. 

6. It was hypothesized that the RS func- 
tion obtained is psychological in nature; a 
specific hypothesis of “psychological cold tol- 
erance” was proposed. 
Weighted Application Blank Analysis of “Contingency”
Items 


Thomas A. Mahoney 
Industrial Relations Center, University of Minnesota 


A method of analysis which has become in- 
creasingly common in prediction studies is 
the “weighted application blank analysis.” 
This method of analysis is a fairly simple and 
standardized method for development of pre- 
dictors from personal history or application 
blank information, and for development of 
weights for test scores in prediction. Despite 
the frequent use of this method in prediction 
studies, however, no attention has been given 
to the analysis of “contingency” items or in- 
formation, items where the answer is con- 
tingent upon answers to one or more previous 
questions. This note concerns the analysis
and weighting of contingency items in the
weighted application blank analysis.

The weighted application blank analysis
begins with identification of criterion groups:
Group A composed of individuals classed
acceptable by the criterion, and Group B
composed of individuals classed unacceptable.
Possible responses to each item of the appli- 
cation blank are categorized for the tabula- 
tion of actual responses. Responses of Group 
A and Group B are tabulated separately for 
the entire list of questions or items. Provi- 
sion is frequently made for individuals who 
fail to respond to a particular item with the 
inclusion of a “no response” category. In 
this manner, a response of some sort can be 
tabulated for each individual in the two cri- 
terion groups. The next step involves calcu- 
lation of the percentage of each group which 
responded in each answer category for each 
of the questions. The same base, total indi- 
viduals in the group, is used in the calcula- 
tion of percentages responding in each answer 
category. The percentage of Group B indi- 
viduals is then subtracted from the percent- 
age of Group A individuals responding in each 
answer category. These percentage differ- 
ences are then transformed into weights for 
a scoring device (1, p. 225, tables). 


Difficulty may arise in the fact that an- 
swers to one or more questions in the appli- 
cation blank are contingent upon answers to 
a previous question. For example, responses 
to a question asking for “number of children”
will be contingent in part upon answers to a 
previous question concerning marital status. 
Analysis of these questions as separate and 
independent questions can result in the as- 
signment of unwarranted weights to certain 
responses of the contingent questions. A spe- 
cific example of this problem is considered 
below: 


Assume that one question or item concerns 
marital status. Possible responses are categorized
as “single,” “married,” “other,” and
“no response.” A second question contingent 
upon the previous one concerns number of 
children. Response categories are: “none,” 
“1-2,” “3-4,” “5 or more,” and “no response.”
Responses to these two questions
might be distributed as indicated in Table 1, 
with resulting weights calculated as also in- 
dicated in Table 1. Note that those who 
responded “single” on the item concerning 
marital status are included in the “none” 
response for number of children along with 
those married individuals who have no chil- 
dren. Since the response “single” is assigned 
a negative weight, the response “none” for 
number of children is also assigned a negative 
weight due to the influence of those individu- 
als who are single. An entirely different 
weighting of responses to number of children 
might have resulted if the single individuals, 
those who “couldn’t respond” to number of 
children, were not considered in the assign- 
ment of weights for the number of children 
question. At the same time, the response 
“single” receives a negative weight twice in 
the example in Table 1—once as a response 
to the marital status item, and once as a re- 
sponse to the number of children item. 
Table 1


Assignment of Item Weights 








Number 
Group A Group B 


Marital Status 

21 

63 
16 
0 


Single 
Married 
Other 
No response 
100 
Number Children 
30 
32 
24 
14 
0 


None 

1-2 

3-4

5 or more 
No response 


100 


This difficulty can be eliminated through 
development of separate scoring systems for 
the contingent questions, questions to which 
answers may be contingent upon answers to 
previous questions. In the example presented 
here, an additional response, “can’t respond,” 
might be assigned the number of children 
question. This response would include the 
single persons who couldn’t legitimately indi- 
cate any children. These individuals would 
not be considered then in the development of 
scoring weights for the number of children 
question. Table 2 indicates the method for 


Percentage 


Group A Group B 


aataaned Net 
Weight 


%A-%B


21% 
63% 
16% 

0% 


47% — 26% 
16% 
12% 

2% 


100% 100% 


30% 
32% 
24% 
14% 


0% 


52% 
24% 
14% 
10% 

0% 


100% 100% 


development of a separate scoring system for 
answers to this contingent question. The 
“can't respond” responses are not included in 
the calculation of percentages or in the as- 
signment of weights to various responses. A 
zero weight is assigned to the “can’t respond” 
group having a neutral effect—these persons 
are not penalized or favored again for their 
single status. As indicated in Table 2, sepa- 
ration of the “can’t respond” group from the 
“none” response changes the weights assigned 
the “none” response. Those married persons 
with no children should not be assigned a 


Table 2 


Revised “Contingency Item” Weights 





Number 
Number of 
Children 


Group A Group B 


21 

9 
32 
24 


47 


5 


Can’t respond 
None 

1-2 

3-4

5 or more 

No response 


Percentage 


Group A Group B 


Weight 
0 
0 


11% 
41% 
30% 
18% 

0% 


100% 


9% 
45% 
26% 
19% 

0% 


100% 
negative weight for their lack of children as
would have happened in the example of 
Table 1. 
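
The effect of separating the “can’t respond” group can be illustrated with the number-of-children figures of the example. The sketch below is a minimal illustration, not the scoring procedure itself: the function names are hypothetical, percentages are rounded to whole percents before subtraction as an assumption about how the tables were prepared, and the final conversion of percentage differences into scoring weights (taken from published tables) is not reproduced.

```python
def rounded_percentages(counts, exclude=()):
    """Percentage of a group in each answer category, rounded to whole
    percents. Categories named in `exclude` are left out of the base."""
    kept = {k: n for k, n in counts.items() if k not in exclude}
    base = sum(kept.values())
    return {k: round(100.0 * n / base) for k, n in kept.items()}


def net_differences(group_a, group_b, exclude=()):
    """Percentage of Group A minus percentage of Group B, per category."""
    pct_a = rounded_percentages(group_a, exclude)
    pct_b = rounded_percentages(group_b, exclude)
    return {k: pct_a[k] - pct_b[k] for k in pct_a}


# Number of children, single persons counted under "none" (first treatment).
a_original = {"none": 30, "1-2": 32, "3-4": 24, "5 or more": 14, "no response": 0}
b_original = {"none": 52, "1-2": 24, "3-4": 14, "5 or more": 10, "no response": 0}

# Same item with single persons moved to "can't respond" (revised treatment).
a_revised = {"can't respond": 21, "none": 9, "1-2": 32, "3-4": 24,
             "5 or more": 14, "no response": 0}
b_revised = {"can't respond": 47, "none": 5, "1-2": 24, "3-4": 14,
             "5 or more": 10, "no response": 0}

original = net_differences(a_original, b_original)
revised = net_differences(a_revised, b_revised, exclude=("can't respond",))
print(original["none"], revised["none"])
```

With single persons included, the “none” response shows a difference of -22 percentage points; with them set aside, the difference for married persons without children is +2, which is the change in sign the note describes.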


The point raised in this note may or may 
not be of much practical importance depend- 
ing on the particular items included in the 
application blank study. A study with few 
or no contingent items would not benefit 
much from the refinement of the method of 
analysis. Refinement of the method to ac- 
count for contingent items in one particular 
study, however, did improve the predictive 
ability of the scoring system. This study is 
one of several studies of personal history pre- 
dictors of management potential conducted 
within the Management Development Labo- 
ratory of the University of Minnesota Indus- 
trial Relations Center. The particular con- 
tingent items covered in this study are: 


Education 
High school organizations 
High school officerships 
High school letters 
Hours worked in high 
school 


College organizations 

College officerships 

College letters 

Hours worked in 
college 


Marital status 


Number of children 
Wife’s education 
Wife’s occupation 


The scoring system and weights developed 
without reference to the fact that responses 
to certain items were contingent upon re-
sponses to other items resulted in distribu- 
tions of high and low criterion groups and a 
cutting score which was exceeded by 84% of 
the high criterion group and by 36% of the 
low criterion group. Revision of the weights 
and scoring system to account for the con- 
tingent responses resulted in 87% of the high 
criterion group exceeding the cutting score 
and 35% of the low criterion group exceeding 
the cutting score. The revision to account 
for contingent responses did improve slightly 
the predictive ability of the scoring system. 
The extent of improvement cannot be as- 
sessed accurately without knowledge of the 
number of candidates accepted over a given 
time period. 

The point raised in this note is that an im- 
proved and more efficient method for de- 
veloping a weighted application blank is pro- 
vided through special handling of contingent 
responses. No additional effort is called for 
in this refinement, and the possible improve- 
ment of prediction suggests the value of this 
scoring of contingent responses. 


Received May 3, 1957. 
Sensitization Versus Adaptation in Preparation for Emer-
gencies: Prior Experience with an Emergency Ration 
and its Acceptability in a Simulated Survival 
Situation * 


E. Paul Torrance 


Survival Methods Branch, Air Force Personnel and Training Research Center 


Military training programs, child training 
practices, and educational programs involving 
realistic stresses often have been attacked as 
being more damaging than beneficial. De- 
fenders of such programs argue that ex- 
periencing fear-evoking stimuli in simulated 
situations results in greater adaptation by 
“removing the fear of the unknown.” The attackers
retort that facing these fear-evoking
stimuli only replaces a “fear of the unknown”
with a “fear of the known” and results in 
sensitization or a reinforcement of the fear- 
evoking stimuli. 

The author and his colleagues are conduct- 
ing a series of field studies concerning this 
issue in the realistically simulated survival 
situation of the USAF Survival Training 
School. One controversial issue in this train- 
ing program has concerned the use of a meat 
bar, familiarly known as “pemmican,” as the 
emergency ration in a seven-day simulated 
survival, evasion, and escape exercise. Though 
this ration is highly favored by polar ex- 
plorers, hunters and trappers, and others (8), 
its acceptability has been rather poor in 
ration trials conducted by the United States 
and Canadian Armies (2). In tests con- 
ducted by the Aero Medical Laboratory (6), 
about 12% of the subjects (aircrew person- 
nel undergoing survival training) reported 
that the ration “made them sick.” 


1 This report is based on work done under ARDC 
Project No. 7723, Task No. 77461, in support of the 
research and development program of the Air Force 
Personnel and Training Research Center, Lackland 
Air Force Base, Texas. Permission is granted for re- 
production, translation, publication, use, and dis- 
posal in whole and in part by or for the United 
States Government. The opinions or conclusions ex- 
pressed or implied herein are those of the author. 
They need not be construed as necessarily reflecting 
the views or endorsement of the Department of the 
Air Force or of the Air Research and Development 
Command. 
In the light of the negative effects revealed
by the Army and Air Force studies, its actual 
use in a training situation has at times been 
questioned even by those who believe that it 
is the best emergency ration now available. 
They fear that use of the ration in training 
might actually deter individuals from eating 
it in an actual emergency. They maintain 
that an individual in an actual emergency is 
more likely to eat the ration if he has never 
tried it than if he has tried it and disliked it. 
Their argument is strengthened by the high 
probability that an individual trying the ration
will dislike it. A study by Mason² of
the psychological and training factors affect- 
ing acceptability of this ration, however, sug- 
gested that prior experience with the ration 
is related to higher acceptability. His study 
did not differentiate those who tried the ra- 
tion and liked it from those who had tried it 
and disliked it. Thus, the present study was
designed to provide more definite information
concerning prior experience and reactions
to the ration.


Procedures 
Subjects 


The Ss of the study were 416 aircrewmen under- 
going survival training and may be regarded as nor- 
mal American males, ranging in age from 20 to 40. 
Each S was issued eight meat bars (pemmican) at 
the beginning of a seven-day simulated survival, 
evasion, and escape exercise. This was supplemented 
by about two pounds of beef and a small quantity 
of vegetables, small packets of chili and onion 
powder, 16 cubes of sugar, and eight packets each 
of soluble coffee and tea. Since the training took 
place during the summer in the Plumas National 
Forest, supplementary plant and animal foods were 
available. 


2 Mason, R. A survey of the psychological and
training factors related to survival ration acceptability.
Unpublished manuscript
Collection of Data


Following the seven-day exercise, Ss were adminis- 
tered a questionnaire to obtain measures of accept- 
ability and other information concerning the field 
experience. Acceptability items included: 

1. The traditional hedonic scale (7-point in this 
study), requiring the S to indicate his reaction 
(ranging from like extremely to dislike extremely) 
to each of the following five common methods of 
preparing the ration: cold, heated with water only, 
heated with water and chili powder, heated with 
water and onion powder, and cooked in a stew with 
plant and/or animal foods. 

2. The number of meat bars eaten. 

3. Reasons for not eating all of the bars issued 
(if applicable). These included: part lost by acci- 
dent, made me sick, made me thirsty, tasted bad, 
smelled bad, too hard or dry, and too greasy. 

4. Conditions under which the S would eat the 
ration in the future (whenever hungry, only when 
very hungry, and not even if very hungry). 


Analysis of Data 


Ratings on the hedonic scale were weighted from 
“1” (like extremely) to “7” (dislike extremely) 
and summed for the five methods of preparation. 
If an individual indicated that he had not tried the 
ration prepared according to one of the methods, 
this method was assigned the mean rating of the 
methods which had been tried. The number of bars 
eaten was used instead of the number of bars un- 
eaten, as used in previous studies (11, 12), since this 
index makes fuller use of the data. Some Ss bar- 
tered bars from fellow crewmen and members of 
other crews. Reported consumption ranged from 
one-sixteenth of a bar to 25 bars. The only reason 
for not eating all of the bars issued which was 
studied was “made me sick,” since this is consid- 
ered a critical reaction and represents acceptability 
at a “gut” level, as does actual consumption. An 
expressed willingness to “eat it whenever hungry” 
was studied as the critical response in the area of 
probability of future use. 
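
The scoring rule for the hedonic scale can be sketched as follows. This is a minimal illustration under the stated rule (ratings weighted 1 to 7, summed over five methods, an untried method receiving the mean of the methods tried); the function name and the use of `None` for an untried method are assumptions made for the example.

```python
def rejection_score(ratings):
    """Sum of hedonic ratings (1 = like extremely ... 7 = dislike extremely)
    over the five methods of preparation. A method the S did not try is
    given as None and is assigned the mean rating of the methods tried."""
    tried = [r for r in ratings if r is not None]
    if not tried:
        raise ValueError("at least one method of preparation must be rated")
    fill = sum(tried) / len(tried)
    return sum(fill if r is None else r for r in ratings)


# Hypothetical S who tried three of the five methods.
print(rejection_score([2, 4, None, 6, None]))
```

Scores therefore range from 5 (all methods liked extremely) to 35 (all methods disliked extremely), with higher scores indicating greater rejection.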

The inexperienced group was first compared with 
the experienced group (those who had tried the 
ration and liked it and those who had tried it and 
disliked it) on each of the four criteria. The means
of the first two criteria were compared by means of 
critical ratios and the number “made sick” and the 
number “willing to eat the ration whenever hungry” 
were compared by chi squares computed according 
to the method described by McNemar (5, pp. 224- 
226). To study further the possible effects of the 
three conditions of prior experience, data on each of 
the four criteria were summarized for each type of 
experience. Data concerning self-evaluated changes 
in attitude were also summarized for each of the 
three types of prior experience. 


Results 


The mean rejection score on the hedonic 
scale, mean number of bars eaten, percentage 
“made sick,” and percentage who would eat 
the ration in the future for the “no experi- 
ence” and “definite experience” groups are 
presented in Table 1 along with appropriate 
tests of significance. It will be noted that 
those with prior experience (regardless of 
whether they liked the ration or not) ex- 
pressed greater liking, ate more bars, less fre- 
quently reported having been made sick, and 
more frequently expressed intentions of eat- 
ing the ration in the future whenever hungry 
than the inexperienced group. All differences 
are significant at better than the 5% level. 
Conclusions concerning the over-all effects of 
prior experience are strengthened by the fact 
that the present sample is heavily loaded with 
Ss who had previously reacted unfavorably 
(74 versus 33). 
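
The chi square for the “made sick” comparison can be approximated from the reported percentages. The sketch below is an illustration only: it assumes an ordinary fourfold chi square without correction for continuity, and it reconstructs the frequencies from the percentages (24.39% of 287, as given for the no-experience group in Table 4, is taken as 70; 11.21% of 107 is taken as 12). The function name is hypothetical.

```python
def chi_square_2x2(a, b, c, d):
    """Chi square for a fourfold table with cells
    a, b (first group: yes, no) and c, d (second group: yes, no)."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Frequencies reconstructed from the reported percentages (assumed).
no_exp_sick, no_exp_n = 70, 287
exp_sick, exp_n = 12, 107

chi2 = chi_square_2x2(no_exp_sick, no_exp_n - no_exp_sick,
                      exp_sick, exp_n - exp_sick)
print(round(chi2, 2))
```

Under these assumed frequencies the statistic rounds to 8.21, the value reported for this comparison.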

Means and standard deviations of scores on
the hedonic scale and number of bars eaten
are shown in Table 2. Requirements for
homogeneity of variance are satisfied in the
case of scores on the hedonic scale but not in
the case of number of bars eaten, according
to Bartlett’s Test. It is interesting to note
that those with less prior experience tend to
be more variable in their verbalized attitudes
(hedonic scale ratings), whereas those with
most previous experience tend to be more
variable on the actual consumption criterion.
This variability is particularly marked in the
case of those who have used the ration and
disliked it.


Table 1

Comparison of “No Prior Experience” Group versus “Definite Prior Experience” Group on
Four Measures of Acceptability of Meat Bar

                                              No Prior      Prior
                                              Experience    Experience    Significance of
Measure                                       (N = 287)     (N = 107)     Difference

Mean rejection score on hedonic scale           21.58         19.97       <.05 (CR = 2.11)
Mean number bars eaten                           6.22          7.56       <.01 (CR = 3.18)
Percentage “made sick”                          24.39         11.21       <.01 (χ² = 8.21)
Percentage “eat in future whenever hungry”      34.49         52.34       <.02 (χ² = 6.60)


Table 2

Means and Standard Deviations of Scores on Hedonic Scale and Number of Bars of Meat
Eaten for Each of Four Conditions of Prior Experience

                                      Hedonic              Bars Eaten
Condition                  Number     Mean      SDᵃ        Mean      SDᵇ

No previous experience      287       21.58     7.10       6.22
Had just tasted              22       21.20     7.12       7.23
Used and liked               33       14.91     5.03       8.54
Used and disliked            74       22.79     5.45       6.41

ᵃ Requirements for homogeneity of variance satisfied.
ᵇ Requirements for homogeneity of variance not satisfied. Using Bartlett's Test, chi square = 9.24, p < .02.

An analysis of variance was made for scores 
on the hedonic scale, since requirements for 
homogeneity of variance were met. The re- 
sults, shown in Table 3, indicate significant 
variation due to the conditions of previous 
experience. Direct tests were then made by 
use of the critical ratio. As might be antici- 
pated, those who had used the ration and 
liked it expressed more favorable attitudes 
than the other groups (significant at better 
than the 1% level). The important finding, 
however, is that those who had used the ra- 
tion and disliked it did not express signifi- 
cantly different attitudes from those who had 
no previous experience (CR = 1.55, not sig- 
nificant). 
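
The F ratio for the hedonic scale scores follows directly from the sums of squares and degrees of freedom in Table 3. The short sketch below only re-derives the mean squares and their ratio from those tabled values; the function name is hypothetical.

```python
def f_ratio(ss_between, df_between, ss_within, df_within):
    """Return (mean square between, mean square within, F ratio)."""
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between, ms_within, ms_between / ms_within


# Sums of squares and df as given in Table 3.
ms_b, ms_w, f = f_ratio(1499.29, 3, 18506.21, 408)
print(round(ms_b, 2), round(ms_w, 2), round(f, 2))
```

The computed mean squares and F agree with the tabled values of 499.76, 45.36, and 11.02.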

Although requirements for homogeneity of
variance are not met in the case of number
of meat bars eaten and an analysis of variance
cannot be run justifiably, it is at least
interesting to note that the mean number of
bars eaten by those who had used the ration
and disliked it is slightly greater (though
certainly not significantly so) than the number
eaten by those with no previous experience.


Table 3

Analysis of Variance Table for Scores on Hedonic Scale
for Four Conditions of Previous Experience

Source of      Sum of                 Mean
Variation      Squares       df       Square     F ratio

Between        1,499.29       3       499.76     11.02 (p < .001)
Within        18,506.21     408        45.36

Total         20,005.50     411


Table 4 presents the percentages reporting 
having been “made sick” and the percent- 
ages who say that they will eat the ration 
whenever they are hungry for the conditions 
of experience. For the purposes of this study, 
the most important fact revealed by Table 4 
is that proportionately fewer of those who 
have used the ration and disliked it report 
having been “made sick” than those with no 
previous experience (chi square = 5.15, p < .05).
Also, it is important to note that
proportionately as many of those who had previously
disliked the ration state that they will
eat it whenever hungry as of those who re- 
ported no previous experience in using the 
ration. 
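
The chi-square value cited above can be approximately reproduced from Table 4. The sketch below is illustrative only. The frequencies are not printed in the source and are reconstructed here from the percentages, assuming 287 Ss with no previous experience (70 reporting having been “made sick”) and 74 Ss who had used and disliked the ration (9 reporting having been “made sick”). The group size of 74 is itself an assumption, taken as the 72 respondents in Table 5 plus the two who did not respond. Use of the uncorrected Pearson formula for a 2 x 2 table is also an assumption, since the source does not say whether a continuity correction was applied.

```python
# Pearson chi-square for a 2 x 2 table, without continuity correction.
# All counts below are reconstructed assumptions, not printed data.

def chi_square_2x2(a, b, c, d):
    """Chi-square for the table [[a, b], [c, d]] (1 degree of freedom)."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator

# Assumed frequencies: "made sick" vs. not, by condition of experience.
DISLIKED_SICK, DISLIKED_TOTAL = 9, 74      # 9/74   = 12.16%
NO_EXP_SICK, NO_EXP_TOTAL = 70, 287        # 70/287 = 24.39%

chi2 = chi_square_2x2(
    DISLIKED_SICK, DISLIKED_TOTAL - DISLIKED_SICK,
    NO_EXP_SICK, NO_EXP_TOTAL - NO_EXP_SICK,
)

# Tabled critical values of chi-square for 1 degree of freedom.
CRITICAL_05, CRITICAL_01 = 3.841, 6.635

print(f"chi-square = {chi2:.2f}")
print("significant at .05:", chi2 > CRITICAL_05)
print("significant at .01:", chi2 > CRITICAL_01)
```

Under these assumptions the statistic rounds to 5.15 and lies between the .05 and .01 critical values for one degree of freedom, which is consistent with the significance level reported in the text.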

Table 4

Percentages “Made Sick” and “Willing to Eat Ration
Whenever Hungry” for Each of Four Conditions
of Experience with the Ration

                                         Percentages
                                      Made      Eat Whenever
Condition                  Number     Sick      Hungry

No previous experience      287       24.39     34.49
Had just tasted                       13.66     50.00
Used and liked                         9.09     87.88
Used and disliked                     12.16     36.49

Data concerning reported changes in reaction
for the three experienced groups are
shown in Table 5. It is interesting to note
that about 32% of those who had previously
eaten and disliked the meat bar liked it better
“this time.”

Table 5

Change in Reaction to Meat Bar Reported by Ss with Three Types of Prior Experience

                           Liked About Same      Liked Better          Liked Worse
Type of Experience         Number  Percentage    Number  Percentage    Number  Percentage

Had just tasted               7      31.82                 50.00          4      18.18
Had eaten and liked          13      40.62                 40.62          6      18.75
Had eaten and disliked       39      54.17                 31.94                 13.89

Total                        59      46.82         47      37.30         20      15.88

Note. One in the second category and two in the third category above did not respond to this item.

Since other studies (11, 12) have shown 
that a number of psychological, social, and 
training factors are related to acceptability 
of this ration, the experienced group and in- 
experienced group were compared on the fol- 
lowing variables: success in obtaining supple- 
mentary food, effort to obtain supplementary 
food, perceived attitude of instructor, per- 
ceived effort of instructor to influence ac- 
ceptability, and perceived reaction of own 
crew. In every case the distribution of re- 
sponses is so nearly identical for the two 
groups that calculation of tests of significance 
of the difference is unnecessary. 

Although results have been shown in 
Tables 2, 4 and 5 for those who had “just 
tasted the ration,” this group has been elimi- 
nated from the analyses reported herein be- 
cause of the small number and the uncer- 
tainty concerning the nature of this experi- 
ence. The experience of “just tasting the 
ration” in no case negatively affects reactions 
and is accompanied by a consistent, though 
not always significant, positive reaction. 


Discussion 


The results of this study may be inter- 
preted as supporting realistic training as 
preparation for successful adaptation in emer- 
gencies, at least in the area of food indoc- 
trination. Those who had previously used 
pemmican, whether they had liked it or not, 
in comparison with those who had had no 
previous experience with it expressed more 
favorable reactions on the hedonic scale, ate 
a larger number of bars, less frequently reported
having been “made sick,” and expressed
a more favorable attitude toward its
future use. Even those who had tried the 
ration and disliked it reacted as favorably as 
those who had had no experience whatsoever 
in using it. If reports of having been “made 
sick” are considered, those who had tried the 
ration and disliked it reacted even more fa- 
vorably than those who had not tried it at all. 

Although related studies have not attacked 
in a direct manner the basic problem posed 
in this paper, such studies suggest that the 
phenomena found in this study may be expected
to be found in other areas of human
behavior. Hudson (1), for example, has sum- 
marized a number of laboratory and field 
studies concerned with anxiety in response to 
the unfamiliar. Hudson maintains that train- 
ing for meeting emergencies is valuable, not 
because it develops the correct behavior pat- 
terns per se but because it provides some 
stability in an otherwise perceptually unstruc- 
tured situation and thereby reduces anxiety. 
Schwartz and Winograd (7) in their studies 
of troop participation in atomic maneuvers 
found that realistic information gained about 
atomic effects is related to changes in attitudes
of confidence or anxiety toward participation
in atomic maneuvers or warfare. Torrance
(10) has also shown that possession of
information about how to survive in extreme 
conditions and gains in such information are 
related to expressed confidence in ability to 
survive such situations. 

A study reported by Taylor, Brozek, Hen- 
schel, Mickelson, and Keys (9) provides pos- 
sibly the strongest support of the findings of 
the present study. These experimenters carried
out metabolic, physiological and psychomotor
measurements on four men who performed
hard work under rigidly controlled
conditions during five successive two-and-one- 
half-day fasts. The successive fasts were 
separated by five- to six-week intervals. Re- 
sults of the first and fifth fasts were com- 
pared. During the second and third days of 
the fasting, all Ss maintained the blood sugar 
at a significantly higher level in the fifth as 
compared to the first fast. Motor speed and 
coordination, reaction time, and pattern trac- 
ing were also superior during the fifth fast 
when compared to the first. 

In both the study of successive fasting and 
the present study, the Ss apparently replaced 
“fears of unknowns” with realistic knowledge 
obtained through actual experience. No doubt 
the men undergoing successive fasts were ini- 
tially anxious concerning what might pos- 
sibly happen to them as a result of fasting. 
After they discovered that they suffered no 
serious ill-effects, their entire systems reacted 
more favorably during later fasts. Pemmican 
also involves something of the unfamiliar. It 
is a somewhat strange looking little bar of 
dried beef and pork mixed with suet. Cer- 
tainly the Ss are not accustomed to eating 
their meat in this form. Using it removes 
this strangeness and results in more favorable 
reactions. Basically the same process is in- 
volved in role playing and psychodrama in 
preparing children to meet new experiences 
(4), preparing individuals for leadership roles 
(3), and preparing hospital patients to adapt 
to outside life. 

Finally, it should be cautioned that though 
generally improved reactions may be expected 
from realistic experiences such as those described in
this study, some negative reactions can be expected.
For example, it will be recalled that those who 
had initially disliked pemmican were more 
variable in their consumption of the ration 
than those who had had no previous experi- 
ence with it. 


Summary 

The issue of sensitization versus adaptation
in preparation for emergencies was studied 
in a specific field situation. Four hundred 
and sixteen normal adult males undergoing 
a realistically simulated survival experience 
were issued eight meat bars (pemmican) as 
a part of their emergency ration for the 
seven-day exercise. Ratings for five methods 
of preparation, number of bars consumed, reports
of having been “made sick,” and attitude
toward future use were used as criteria
of the Ss’ acceptance of the ration. 

The Ss who had previously used the ration, 
regardless of whether they liked or disliked 
it, responded more favorably according to all 
four criteria when compared with those who 
had had no experience with the ration. Even 
those who had tried the ration and disliked it 
responded as favorably as those who had not 
tried it. Fewer of those who had disliked the 
ration reported having been “made sick” by 
the ration than those who had never tried it. 

The results have been interpreted as sup- 
porting arguments in favor of realistically 
simulated training as preparation for adapta- 
tion in emergencies. 


Received May 6, 1957. 